thr3ads.net - mongrel unicorn - Any signal other than -9 causes full CPU utilization by master unicorn process on FreeBSD [Jul 2012]

Mark Mccraw

2012-Jul-17 00:33 UTC

Any signal other than -9 causes full CPU utilization by master unicorn process on FreeBSD

Hi There!

I'm having a devil of a time figuring out a weird issue I'm running into.
I have unicorn configured to start 4 worker processes, and that works great.
However, when it's time to cycle the app, everything goes haywire.  By trial
and error, I have narrowed it down to this:  sending any signal to the master
process other than SIGKILL fails miserably.  No new master process is created,
as described in the documentation, nothing happens to the existing workers,
nothing gets written to any log, and if I run top -u, I can see that very
quickly the master ramps up to 100% CPU utilization.  This happens if I run
'kill -HUP <master pid>', 'kill -USR2 <master pid>', even
'kill -QUIT <master pid>'.

Here's what I'm running on:

uname -a
FreeBSD bb20web04.unx.sas.com 9.0-RELEASE FreeBSD 9.0-RELEASE #0: Tue Jan  3 07:46:30 UTC 2012     root@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64

ruby -v
ruby 1.9.3p0 (2011-10-30 revision 33570) [amd64-freebsd9]

gem list | grep unicorn
unicorn (4.3.1)

My unicorn.rb file is pasted at the bottom.  It should be noted that I have
tried every permutation of this I can think of to narrow down the problematic
part (set preload_app to false, comment out preload_app, comment out
before_exec, before_fork, after_fork, comment out the START_CTX[0] bit, etc.),
but things always fail the same way, so I'm guessing it's not
the config, but I'm open to anything.

Any suggestions at all are greatly appreciated.  I'd love to know how
to interrupt the master process when it's slamming the CPU and get a
stack trace, but I have no idea how in ruby.  Any thoughts?

Thanks!
Mark


APP_ROOT="/usr/local/rails/partsdb/current"
working_directory APP_ROOT
pid "#{APP_ROOT}/tmp/pids/unicorn.pid"
stderr_path "#{APP_ROOT}/log/unicorn.log"
stdout_path "#{APP_ROOT}/log/unicorn.log"
Unicorn::HttpServer::START_CTX[0] = "#{APP_ROOT}/bin/unicorn"
rails_env = ENV['RAILS_ENV'] || 'production'
worker_processes 4
timeout 120

# Speed up worker spawn times
preload_app true

listen "/tmp/unicorn.sock", :backlog => 10
listen "bb20web04:8080", :backlog => 1024

before_exec do |server|
  ENV["BUNDLE_GEMFILE"] = "#{APP_ROOT}/Gemfile"
end

before_fork do |server, worker|
  ##
  # When sent a USR2, Unicorn will suffix its pidfile with .oldbin and
  # immediately start loading up a new version of itself (loaded with a new
  # version of our app). When this new Unicorn is completely loaded
  # it will begin spawning workers. The first worker spawned will check to
  # see if an .oldbin pidfile exists. If so, this means we've just booted up
  # a new Unicorn and need to tell the old one that it can now die. To do so
  # we send it a QUIT.
  #
  # Using this method we get 0 downtime deploys.
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.connection.disconnect!
  end
  old_pid = APP_ROOT + '/tmp/pids/unicorn.pid.oldbin'
  if File.exists?(old_pid) && server.pid != old_pid
    begin
      Process.kill("QUIT", File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH
      # someone else did our job for us
    end
  end
end


after_fork do |server, worker|
  ##
  # Unicorn master loads the app then forks off workers - because of the way
  # Unix forking works, we need to make sure we aren't using any of the parent's
  # sockets, e.g. db connection
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.establish_connection
  end
  # Redis and Memcached would go here but their connections are established
  # on demand, so the master never opens a socket
end

Eric Wong

2012-Jul-17 02:05 UTC

Mark Mccraw <Mark.Mccraw at sas.com> wrote:
> Hi There!
>
> I'm having a devil of a time figuring out a weird issue I'm running
> into.  I have unicorn configured to start 4 worker processes, and that
> works great.  However, when it's time to cycle the app, everything
> goes haywire. By trial and error, I have narrowed it down to this:
> sending any signal to the master process other than SIGKILL fails
> miserably.  No new master process is created, as described in the
> documentation, nothing happens to the existing workers, nothing gets
> written to any log, and if I run top -u, I can see that very quickly
> the master ramps up to 100% CPU utilization.  This happens if I run
> 'kill -HUP <master pid>', 'kill -USR2 <master pid>', even 'kill -QUIT
> <master pid>'.

This sounds like a Ruby/FreeBSD bug we've seen before.  My script
in http://mid.gmane.org/20120201181445.GA31624@dcvr.yhbt.net should
reproduce the issue w/o unicorn.

> ruby 1.9.3p0 (2011-10-30 revision 33570) [amd64-freebsd9]

I think this is a Ruby bug that was fixed in 1.9.3-p30 according to
naruse:
http://mid.gmane.org/CAK6HhsppWVPijWLyZMwcKueYDT5sZroGv6ADXkgreht3aLfR9A@mail.gmail.com

Since 1.9.3 p194 is the latest, can you try that out and confirm the
fix?  I don't remember if the other bug reporter confirmed this issue was
fixed by upgrading Ruby.

Thanks.

Mark Mccraw

2012-Jul-17 11:56 UTC

On Jul 16, 2012, at 10:05 PM, Eric Wong wrote:
> Mark Mccraw <Mark.Mccraw at sas.com> wrote:
>> Hi There!
>>
>> I'm having a devil of a time figuring out a weird issue I'm running
>> into.  I have unicorn configured to start 4 worker processes, and that
>> works great.  However, when it's time to cycle the app, everything
>> goes haywire. By trial and error, I have narrowed it down to this:
>> sending any signal to the master process other than SIGKILL fails
>> miserably.  No new master process is created, as described in the
>> documentation, nothing happens to the existing workers, nothing gets
>> written to any log, and if I run top -u, I can see that very quickly
>> the master ramps up to 100% CPU utilization.  This happens if I run
>> 'kill -HUP <master pid>', 'kill -USR2 <master pid>', even 'kill -QUIT
>> <master pid>'.
>
> This sounds like a Ruby/FreeBSD bug we've seen before.  My script
> in http://mid.gmane.org/20120201181445.GA31624@dcvr.yhbt.net should
> reproduce the issue w/o unicorn.

You are absolutely correct!  Your script replicates the problem perfectly.

>> ruby 1.9.3p0 (2011-10-30 revision 33570) [amd64-freebsd9]
>
> I think this is a Ruby bug that was fixed in 1.9.3-p30 according to
> naruse:
> http://mid.gmane.org/CAK6HhsppWVPijWLyZMwcKueYDT5sZroGv6ADXkgreht3aLfR9A@mail.gmail.com
>
> Since 1.9.3 p194 is the latest, can you try that out and confirm the
> fix?  I don't remember if the other bug reporter confirmed this issue was
> fixed by upgrading Ruby.

We're upgrading now to see what happens.  I'm so glad you knew about this.
There's no telling how long it would have taken me to question the ruby
interpreter implementation, and since it's FreeBSD, I never would have
found it by googling.  Thanks for hours (days?) of my life back.

Mark

Mark Mccraw

2012-Jul-17 21:23 UTC

On Jul 17, 2012, at 7:56 AM, Mark McCraw wrote:
>
> On Jul 16, 2012, at 10:05 PM, Eric Wong wrote:
>
>> Mark Mccraw <Mark.Mccraw at sas.com> wrote:
>>> Hi There!
>>>
>>> I'm having a devil of a time figuring out a weird issue I'm running
>>> into.  I have unicorn configured to start 4 worker processes, and that
>>> works great.  However, when it's time to cycle the app, everything
>>> goes haywire. By trial and error, I have narrowed it down to this:
>>> sending any signal to the master process other than SIGKILL fails
>>> miserably.  No new master process is created, as described in the
>>> documentation, nothing happens to the existing workers, nothing gets
>>> written to any log, and if I run top -u, I can see that very quickly
>>> the master ramps up to 100% CPU utilization.  This happens if I run
>>> 'kill -HUP <master pid>', 'kill -USR2 <master pid>', even 'kill -QUIT
>>> <master pid>'.
>>
>> This sounds like a Ruby/FreeBSD bug we've seen before.  My script
>> in http://mid.gmane.org/20120201181445.GA31624@dcvr.yhbt.net should
>> reproduce the issue w/o unicorn.
>
> You are absolutely correct!  Your script replicates the problem perfectly.
>
>>> ruby 1.9.3p0 (2011-10-30 revision 33570) [amd64-freebsd9]
>>
>> I think this is a Ruby bug that was fixed in 1.9.3-p30 according to
>> naruse:
>> http://mid.gmane.org/CAK6HhsppWVPijWLyZMwcKueYDT5sZroGv6ADXkgreht3aLfR9A@mail.gmail.com
>>
>> Since 1.9.3 p194 is the latest, can you try that out and confirm the
>> fix?  I don't remember if the other bug reporter confirmed this issue was
>> fixed by upgrading Ruby.
>
> We're upgrading now to see what happens.  I'm so glad you knew about this.
> There's no telling how long it would have taken me to question the ruby
> interpreter implementation, and since it's FreeBSD, I never would have
> found it by googling.  Thanks for hours (days?) of my life back.
>
> Mark

Just to follow up and close out the thread - Eric's recollection was spot on.
We upgraded ruby on our FreeBSD server to the latest thing, and the problem
completely disappeared.  Thanks again!

Eric Wong

2012-Jul-17 22:17 UTC

Mark Mccraw <Mark.Mccraw at sas.com> wrote:
> On Jul 17, 2012, at 7:56 AM, Mark McCraw wrote:
> > We're upgrading now to see what happens.  I'm so glad you knew about
> > this.  There's no telling how long it would have taken me to
> > question the ruby interpreter implementation, and since it's
> > FreeBSD, I never would have found it by googling.  Thanks for hours
> > (days?) of my life back.
>
> Just to follow up and close out the thread - Eric's recollection was
> spot on.  We upgraded ruby on our FreeBSD server to the latest thing,
> and the problem completely disappeared.  Thanks again!

Thanks for confirming this fix!

Fwiw, the Ruby core team probably uses/tests on GNU/Linux more than any
other platform.  Bugs on less common development platforms (especially
w.r.t. tricky thread/fork/signal handling issues) may go unnoticed
elsewhere.  If you're focused on using Ruby + *BSD in a production
system, I suggest testing/fixing/reporting issues against the Ruby
development branches as much as possible before they hit production :)
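
[Editor's note] One way to act on the advice above is a small signal smoke test run against each new Ruby build before it reaches production. The sketch below is not the reproduction script Eric links to, and it is not guaranteed to trigger this particular FreeBSD bug; it only checks that a trapped signal sent to the current process reaches a Ruby handler within a timeout. The helper name signal_delivered? and the 5-second default are assumptions made for illustration.

```ruby
# Hypothetical helper (not part of unicorn or Ruby): returns true if a
# trapped signal sent to this process runs its handler within `wait` seconds.
def signal_delivered?(signame, wait = 5)
  rd, wr = IO.pipe
  # Self-pipe trick: the handler only writes one byte, which is safe to do
  # from inside a trap handler.
  old = trap(signame) { wr.write_nonblock('.') }
  Process.kill(signame, Process.pid)
  # IO.select returns nil on timeout, i.e. the handler never ran.
  !IO.select([rd], nil, nil, wait).nil?
ensure
  # Restore the previous handler and release the pipe.
  trap(signame, old || 'DEFAULT')
  rd.close if rd
  wr.close if wr
end

# The three signals from the original report.
%w[HUP USR2 QUIT].each do |sig|
  puts "#{sig}: #{signal_delivered?(sig) ? 'delivered' : 'NOT delivered'}"
end
```

A build that prints "NOT delivered" for any of these, or that pins a CPU core while the script runs, is worth investigating before deploying unicorn on it.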
