ghazel at gmail.com
2011-Jan-13 21:20 UTC
Unicorn 3.3.1 "Too many open files" error in kgio_tryaccept
Strange error which took down my server today: Unhandled listen loop exception #<Errno::EMFILE: Too many open files - accept>. /usr/local/ruby-enterprise-1.8.7-2010.01/lib/ruby/gems/1.8/gems/unicorn-3.3.1/lib/unicorn/http_server.rb:606:in `kgio_tryaccept'' /usr/local/ruby-enterprise-1.8.7-2010.01/lib/ruby/gems/1.8/gems/unicorn-3.3.1/lib/unicorn/http_server.rb:606:in `worker_loop'' /usr/local/ruby-enterprise-1.8.7-2010.01/lib/ruby/gems/1.8/gems/unicorn-3.3.1/lib/unicorn/http_server.rb:605:in `each'' /usr/local/ruby-enterprise-1.8.7-2010.01/lib/ruby/gems/1.8/gems/unicorn-3.3.1/lib/unicorn/http_server.rb:605:in `worker_loop'' ... My unicorn.stderr.log is 7.6GB, and I rotate it every day... Before all that started I got: Read error: #<IOError: closed stream> /usr/local/ruby-enterprise-1.8.7-2010.01/lib/ruby/gems/1.8/gems/unicorn-3.3.1/lib/unicorn/http_response.rb:37:in `write'' /usr/local/ruby-enterprise-1.8.7-2010.01/lib/ruby/gems/1.8/gems/unicorn-3.3.1/lib/unicorn/http_response.rb:37:in `http_response_write'' /usr/local/ruby-enterprise-1.8.7-2010.01/lib/ruby/gems/1.8/gems/unicorn-3.3.1/lib/unicorn/http_server.rb:540:in `process_client'' /usr/local/ruby-enterprise-1.8.7-2010.01/lib/ruby/gems/1.8/gems/unicorn-3.3.1/lib/unicorn/http_server.rb:607:in `worker_loop'' /usr/local/ruby-enterprise-1.8.7-2010.01/lib/ruby/gems/1.8/gems/unicorn-3.3.1/lib/unicorn/http_server.rb:605:in `each'' /usr/local/ruby-enterprise-1.8.7-2010.01/lib/ruby/gems/1.8/gems/unicorn-3.3.1/lib/unicorn/http_server.rb:605:in `worker_loop'' But the servers seemed to be basically operating at that point. -Greg
Eric Wong
2011-Jan-13 23:06 UTC
Unicorn 3.3.1 "Too many open files" error in kgio_tryaccept
ghazel at gmail.com wrote:> Strange error which took down my server today: > > Unhandled listen loop exception #<Errno::EMFILE: Too many open files - accept>. > ... > > My unicorn.stderr.log is 7.6GB, and I rotate it every day... > > Before all that started I got: > > Read error: #<IOError: closed stream> > /usr/local/ruby-enterprise-1.8.7-2010.01/lib/ruby/gems/1.8/gems/unicorn-3.3.1/lib/unicorn/http_response.rb:37:in > > But the servers seemed to be basically operating at that point.Weird. I have many questions: Are you processing uploads at all? Is nginx in front of Unicorn? Do you open a lot of files in your application and never close them?[1] How does lsof output look for your workers? Can you check any exception capturing/logging middleware you have for any other errors that may be swallowed? (especially while reading env["rack.input"]) I changed the socket close ordering for 3.3.1 to workaround an esoteric "bug", but I don''t see how this could happen right now... Does this happen with 3.3.0? Or 1.1.6? hiswhich has the same "bugfix" as 3.3.1. Also, is anybody else having this problem? [1] - GC should be able to cleanup open files on MRI/REE anyways -- Eric Wong
ghazel at gmail.com
2011-Jan-14 02:17 UTC
Unicorn 3.3.1 "Too many open files" error in kgio_tryaccept
On Thu, Jan 13, 2011 at 3:06 PM, Eric Wong <normalperson at yhbt.net> wrote:> ghazel at gmail.com wrote: >> Strange error which took down my server today: >> >> Unhandled listen loop exception #<Errno::EMFILE: Too many open files - accept>. >> ... >> >> My unicorn.stderr.log is 7.6GB, and I rotate it every day... >> >> Before all that started I got: >> >> Read error: #<IOError: closed stream> >> /usr/local/ruby-enterprise-1.8.7-2010.01/lib/ruby/gems/1.8/gems/unicorn-3.3.1/lib/unicorn/http_response.rb:37:in >> >> But the servers seemed to be basically operating at that point. > > Weird. ?I have many questions: > > Are you processing uploads at all?Yes, but I have been processing uploads on Unicorn since 0.99.something> Is nginx in front of Unicorn?Yes.> Do you open a lot of files in your application and never close them?[1]I don''t believe so, and certain nothing has changed there in quite awhile.> How does lsof output look for your workers?Hm. The workers seem to be at 80-90 file descriptors each. I did catch one at 787 (!) with mostly handles to a geoip database from the geoip gem, but they got collected pretty quickly. Perhaps that''s the cause!> Can you check any exception capturing/logging middleware you have > for any other errors that may be swallowed? (especially > while reading env["rack.input"])I don''t have any non-standard Rails 2.3.10 middleware which reads rack.input.> I changed the socket close ordering for 3.3.1 to workaround an esoteric > "bug", but I don''t see how this could happen right now... > > Does this happen with 3.3.0?I''ve only seen it once. If I manage to reproduce it I''ll try with 3.3.0. -Greg
Eric Wong
2011-Jan-14 02:47 UTC
Unicorn 3.3.1 "Too many open files" error in kgio_tryaccept
ghazel at gmail.com wrote:> On Thu, Jan 13, 2011 at 3:06 PM, Eric Wong <normalperson at yhbt.net> wrote: > > ghazel at gmail.com wrote: > > How does lsof output look for your workers? > > Hm. The workers seem to be at 80-90 file descriptors each. I did catch > one at 787 (!) with mostly handles to a geoip database from the geoip > gem, but they got collected pretty quickly. Perhaps that''s the cause!OK, that''s a fairly likely cause of EMFILE. A tip for geoip users: Install the io-extra gem to get IO.pread. This allows you to reuse the same file descriptor with geoip automatically between any number of threads/processes without reopening it.> > Does this happen with 3.3.0? > > I''ve only seen it once. If I manage to reproduce it I''ll try with 3.3.0.The 3.3.x series generates less garbage than previous versions, so it could be causing the GC to run less frequently and increase the chance of the GC failing to collect descriptors. -- Eric Wong
ghazel at gmail.com
2011-Jan-14 04:20 UTC
Unicorn 3.3.1 "Too many open files" error in kgio_tryaccept
On Thu, Jan 13, 2011 at 6:47 PM, Eric Wong <normalperson at yhbt.net> wrote:> ghazel at gmail.com wrote: >> On Thu, Jan 13, 2011 at 3:06 PM, Eric Wong <normalperson at yhbt.net> wrote: >> > ghazel at gmail.com wrote: >> > How does lsof output look for your workers? >> >> Hm. The workers seem to be at 80-90 file descriptors each. I did catch >> one at 787 (!) with mostly handles to a geoip database from the geoip >> gem, but they got collected pretty quickly. Perhaps that''s the cause! > > OK, that''s a fairly likely cause of EMFILE. > > A tip for geoip users: > > ?Install the io-extra gem to get IO.pread. ?This allows you to reuse > ?the same file descriptor with geoip automatically between any number > ?of threads/processes without reopening it.Fascinating. Google tells be you''ve been down this road before. My code previously was doing: def get_city(ip) GeoIP.new("/path/to/geo.dat").city(ip) end Which seems to create one fd per call (and leave it for the GC to cleanup). What''s the new proposed interface if io-extra is installed? Keep a global GeoIP object somewhere? My fix was to stuff it in to Thread.current, but obviously that has one fd per thread. -Greg
Eric Wong
2011-Jan-14 09:17 UTC
Unicorn 3.3.1 "Too many open files" error in kgio_tryaccept
ghazel at gmail.com wrote:> On Thu, Jan 13, 2011 at 6:47 PM, Eric Wong <normalperson at yhbt.net> wrote: > > ghazel at gmail.com wrote: > >> On Thu, Jan 13, 2011 at 3:06 PM, Eric Wong <normalperson at yhbt.net> wrote: > >> > ghazel at gmail.com wrote: > >> > How does lsof output look for your workers? > >> > >> Hm. The workers seem to be at 80-90 file descriptors each. I did catch > >> one at 787 (!) with mostly handles to a geoip database from the geoip > >> gem, but they got collected pretty quickly. Perhaps that''s the cause! > > > > OK, that''s a fairly likely cause of EMFILE. > > > > A tip for geoip users: > > > > ?Install the io-extra gem to get IO.pread. ?This allows you to reuse > > ?the same file descriptor with geoip automatically between any number > > ?of threads/processes without reopening it. > > Fascinating. Google tells be you''ve been down this road before. My > code previously was doing: > def get_city(ip) > GeoIP.new("/path/to/geo.dat").city(ip) > end > > Which seems to create one fd per call (and leave it for the GC to > cleanup). What''s the new proposed interface if io-extra is installed? > Keep a global GeoIP object somewhere? My fix was to stuff it in to > Thread.current, but obviously that has one fd per thread.Yes, with io-extra, just define a constant somewhere and use it in any thread/process at any time. No need to deal with locks or after_fork hooks in Unicorn at all, either, pread() is just that awesome :) GEO_DB = GeoIP.new("/path/to/geo.dat") def get_city(ip) GEO_DB.city(ip) end -- Eric Wong
Maybe Matching Threads
- [PATCH] close client socket after closing response body
- [PATCH] replace fchmod()-based heartbeat with raindrops
- Rack content-length Rack::Lint::LintErrors errors with unicorn
- [PATCH] Start the server if another user has a PID matching our stale pidfile.
- Handling closed clients