Hi, I''ve realized that if I send USR2 very quicky to master process (each time to the new master process, of course) then sometimes Unicorn just dies totally and returns 0. Not sure, but I think that those times Unicorn receives the USR2 signal prior to load the code in which such signal is handled, could it be? Regards. -- I?aki Baz Castillo <ibc at aliax.net>
El Domingo, 17 de Enero de 2010, I?aki Baz Castillo escribi?:> Hi, I''ve realized that if I send USR2 very quicky to master process (each > time to the new master process, of course) then sometimes Unicorn just > dies totally and returns 0.Sorry, I''m not sure if it returns 0 or not in such case since I''m doing the check with a init script which performs the "reload" action by sending USR2 to master process (which runs in background).> Not sure, but I think that those times Unicorn receives the USR2 signal > prior to load the code in which such signal is handled, could it be?It makes sense since by default USR2 terminates a Ruby program. What about if Unicorn very quicky prepares the trap for USR2 so in case it receives it soon when starting it ignores it (and logs some warning)? Does it make sense? -- I?aki Baz Castillo <ibc at aliax.net>
El Domingo, 17 de Enero de 2010, I?aki Baz Castillo escribi?:> What about if Unicorn very quicky prepares the trap for USR2 so in case it > receives it soon when starting it ignores it (and logs some warning)? > Does it make sense?Adding the following on the top of bin/unicorn solves the problem: # Ignore USR2 (instead of terminating script) in case it arrives too soon. trap("USR2") do $stderr.puts "WARN: USR2 signal (reload action) received too soon, ignored" end -- I?aki Baz Castillo <ibc at aliax.net>
I?aki Baz Castillo <ibc at aliax.net> wrote:> El Domingo, 17 de Enero de 2010, I?aki Baz Castillo escribi?: > > What about if Unicorn very quicky prepares the trap for USR2 so in case it > > receives it soon when starting it ignores it (and logs some warning)? > > Does it make sense? > > Adding the following on the top of bin/unicorn solves the problem: > > # Ignore USR2 (instead of terminating script) in case it arrives too soon. > trap("USR2") do > $stderr.puts "WARN: USR2 signal (reload action) received too soon, ignored" > endNo it doesn''t, it just reduces the window where signals aren''t setup. On systems where Unicorn is installed as a RubyGem, that window of time may be much bigger. Is the issue with a script reading the pid file and sending the signal because it exists? In that case, it might be better to drop the pid file after setting up signal handlers, but I seem to recall some tools wanting the pid file earlier... I''ll get back to this on Tuesday. -- Eric Wong
El Lunes, 18 de Enero de 2010, Eric Wong escribi?:> I?aki Baz Castillo <ibc at aliax.net> wrote: > > El Domingo, 17 de Enero de 2010, I?aki Baz Castillo escribi?: > > > What about if Unicorn very quicky prepares the trap for USR2 so in case > > > it receives it soon when starting it ignores it (and logs some > > > warning)? Does it make sense? > > > > Adding the following on the top of bin/unicorn solves the problem: > > > > # Ignore USR2 (instead of terminating script) in case it arrives too > > soon. trap("USR2") do > > $stderr.puts "WARN: USR2 signal (reload action) received too soon, > > ignored" end > > No it doesn''t, it just reduces the window where signals aren''t setup. > On systems where Unicorn is installed as a RubyGem, that window of time > may be much bigger.But note that I''ve added: trap("USR2") do $stderr.puts "WARN: USR2 signal (reload action) received too soon, ignored" end at the beginning of the executable, so the time window is really reduced. Also note that in my modified version of the executable I load many other gems and them make the unicorn loading slower, increasing the time window.> Is the issue with a script reading the pid file and sending the signal > because it exists?I''ve done a init script to start/stop/reload/log-rotate Unicorn. The action "log-rotate" takes the PID from the pid file and sends a USR1 (this call is made by logrotate). The action "reload" sends a USR2 and later a TERM (depending on if the PID has changed or not). I just want to avoid that, in case a user invokes "reload" twice too fast, Unicorn receives USR2 before preparing it so it dies. With my "hack" is more difficult such possibility to occur. Thanks. -- I?aki Baz Castillo <ibc at aliax.net>
I?aki Baz Castillo <ibc at aliax.net> wrote:> El Lunes, 18 de Enero de 2010, Eric Wong escribi?: > > I?aki Baz Castillo <ibc at aliax.net> wrote: > > > El Domingo, 17 de Enero de 2010, I?aki Baz Castillo escribi?: > > > > What about if Unicorn very quicky prepares the trap for USR2 so in case > > > > it receives it soon when starting it ignores it (and logs some > > > > warning)? Does it make sense? > > > > > > Adding the following on the top of bin/unicorn solves the problem: > > > > > > # Ignore USR2 (instead of terminating script) in case it arrives too > > > soon. trap("USR2") do > > > $stderr.puts "WARN: USR2 signal (reload action) received too soon, > > > ignored" end > > > > No it doesn''t, it just reduces the window where signals aren''t setup. > > On systems where Unicorn is installed as a RubyGem, that window of time > > may be much bigger. > > But note that I''ve added: > > trap("USR2") do > $stderr.puts "WARN: USR2 signal (reload action) received too soon, > ignored" > end > > at the beginning of the executable, so the time window is really reduced. Also > note that in my modified version of the executable I load many other gems and > them make the unicorn loading slower, increasing the time window.On systems with RubyGems, the executable is always an automatically generated wrapper script that does "require ''rubygems''" first.> > Is the issue with a script reading the pid file and sending the signal > > because it exists? > > I''ve done a init script to start/stop/reload/log-rotate Unicorn. The action > "log-rotate" takes the PID from the pid file and sends a USR1 (this call is > made by logrotate). The action "reload" sends a USR2 and later a TERM > (depending on if the PID has changed or not). > > I just want to avoid that, in case a user invokes "reload" twice too fast, > Unicorn receives USR2 before preparing it so it dies. With my "hack" is more > difficult such possibility to occur.I''ll look into what the consequences of writing the pid file after signal handlers are setup later this week. Unfortunately pid files are inherently racy for the USR2 case. Perhaps your script can check the ctime of the pid file... -- Eric Wong
El Lunes, 18 de Enero de 2010, Eric Wong escribi?:> On systems with RubyGems, the executable is always an automatically > generated wrapper script that does "require ''rubygems''" first.True, but I use Ruby 1.9 which doesn''t require using "rubygems". However it''s of course true that the binary does a "gem_load_bin(''unicorn''..)" and so. But in my case, for now, I use directly the *real* executable rather than the executable that Gem puts in /usr/bin or /usr/local/bin.> > > Is the issue with a script reading the pid file and sending the signal > > > because it exists? > > > > I''ve done a init script to start/stop/reload/log-rotate Unicorn. The > > action "log-rotate" takes the PID from the pid file and sends a USR1 > > (this call is made by logrotate). The action "reload" sends a USR2 and > > later a TERM (depending on if the PID has changed or not). > > > > I just want to avoid that, in case a user invokes "reload" twice too > > fast, Unicorn receives USR2 before preparing it so it dies. With my > > "hack" is more difficult such possibility to occur. > > I''ll look into what the consequences of writing the pid file after > signal handlers are setup later this week.Hum, yes, that sound good.> Unfortunately pid files are inherently racy for the USR2 case. Perhaps > your script can check the ctime of the pid file...Good suggestion :) -- I?aki Baz Castillo <ibc at aliax.net>
Eric Wong
2010-Jan-20 02:38 UTC
[PATCH] initialize signal handlers before writing pid file
This prevents trigger-happy init scripts from reading the pid file (and thus sending signals) to a not-fully initialized master process to handle them. This does NOT fix anything if other processes are sending signals prematurely without relying on the presence of the pid file. It''s not possible to prevent all cases of this in one process, even in a purely C application, so we won''t bother trying. We continue to always defer signal handling to the main loop anyways, and signals sent to the master process will be deferred/ignored until Unicorn::HttpServer#join is run. --- Eric Wong <normalperson at yhbt.net> wrote: > I''ll look into what the consequences of writing the pid file after > signal handlers are setup later this week. I just pushed this out to git://git.bogomips.org/unicorn.git There were some test cases (that shouldn''t be called "unit tests") that needed to be fixed but I don''t think anything else is broken, but be sure to let us know either way. lib/unicorn.rb | 16 +++++++++++----- test/test_helper.rb | 5 +++++ test/unit/test_server.rb | 1 + test/unit/test_signals.rb | 6 +++++- test/unit/test_upload.rb | 1 + 5 files changed, 23 insertions(+), 6 deletions(-) diff --git a/lib/unicorn.rb b/lib/unicorn.rb index 7a1ef34..e3e4315 100644 --- a/lib/unicorn.rb +++ b/lib/unicorn.rb @@ -210,7 +210,18 @@ module Unicorn end config_listeners.each { |addr| listen(addr) } raise ArgumentError, "no listeners" if LISTENERS.empty? + + # this pipe is used to wake us up from select(2) in #join when signals + # are trapped. See trap_deferred. + init_self_pipe! + + # setup signal handlers before writing pid file in case people get + # trigger happy and send signals as soon as the pid file exists. + # Note that signals don''t actually get handled until the #join method + QUEUE_SIGS.each { |sig| trap_deferred(sig) } + trap(:CHLD) { |_| awaken_master } self.pid = config[:pid] + self.master_pid = $$ build_app! if preload_app maintain_worker_count @@ -322,14 +333,9 @@ module Unicorn # one-at-a-time time and we''ll happily drop signals in case somebody # is signalling us too often. def join - # this pipe is used to wake us up from select(2) in #join when signals - # are trapped. See trap_deferred - init_self_pipe! respawn = true last_check = Time.now - QUEUE_SIGS.each { |sig| trap_deferred(sig) } - trap(:CHLD) { |sig_nr| awaken_master } proc_name ''master'' logger.info "master process ready" # test_exec.rb relies on this message if ready_pipe diff --git a/test/test_helper.rb b/test/test_helper.rb index 3bdbeb1..5b750ee 100644 --- a/test/test_helper.rb +++ b/test/test_helper.rb @@ -294,3 +294,8 @@ def chunked_spawn(stdout, *cmd) end while true } end + +def reset_sig_handlers + sigs = %w(CHLD).concat(Unicorn::HttpServer::QUEUE_SIGS) + sigs.each { |sig| trap(sig, "DEFAULT") } +end diff --git a/test/unit/test_server.rb b/test/unit/test_server.rb index 00705d0..36333a0 100644 --- a/test/unit/test_server.rb +++ b/test/unit/test_server.rb @@ -41,6 +41,7 @@ class WebServerTest < Test::Unit::TestCase File.truncate("test_stderr.#$$.log", 0) @server.stop(true) end + reset_sig_handlers end def test_preload_app_config diff --git a/test/unit/test_signals.rb b/test/unit/test_signals.rb index eb2af0b..71cf8f4 100644 --- a/test/unit/test_signals.rb +++ b/test/unit/test_signals.rb @@ -40,6 +40,10 @@ class SignalsTest < Test::Unit::TestCase @server = nil end + def teardown + reset_sig_handlers + end + def test_worker_dies_on_dead_master pid = fork { app = lambda { |env| [ 200, {''X-Pid'' => "#$$" }, [] ] } @@ -190,7 +194,7 @@ class SignalsTest < Test::Unit::TestCase killer = fork { loop { Process.kill(:HUP, pid); sleep(0.0001) } } buf = '' '' * @bs @count.times { sock.syswrite(buf) } - Process.kill(:TERM, killer) + Process.kill(:KILL, killer) Process.waitpid2(killer) redirect_test_io { @server.stop(true) } # can''t check for == since pending signals get merged diff --git a/test/unit/test_upload.rb b/test/unit/test_upload.rb index 7ac3c9e..dc0eb40 100644 --- a/test/unit/test_upload.rb +++ b/test/unit/test_upload.rb @@ -55,6 +55,7 @@ class UploadTest < Test::Unit::TestCase def teardown redirect_test_io { @server.stop(true) } if @server @random.close + reset_sig_handlers end def test_put -- Eric Wong
Iñaki Baz Castillo
2010-Jan-20 08:45 UTC
[PATCH] initialize signal handlers before writing pid file
El Mi?rcoles, 20 de Enero de 2010, Eric Wong escribi?:> This prevents trigger-happy init scripts from reading the pid > file (and thus sending signals) to a not-fully initialized > master process to handle them. > > This does NOT fix anything if other processes are sending > signals prematurely without relying on the presence of the pid > file. It''s not possible to prevent all cases of this in one > process, even in a purely C application, so we won''t bother > trying. > > We continue to always defer signal handling to the main loop > anyways, and signals sent to the master process will be > deferred/ignored until Unicorn::HttpServer#join is run.Hi Eric, I''ve already tested it. It works great :) To check it I use the following init script "reload" action which hope could be useful for others (it requires loading "/lib/lsb/init-functions"): reload|force-reload|safe-reload) # Return # 0 if daemon has been reloaded # 1 if daemon was not running # 1 if daemon could not be reloaded # log_daemon_msg "Safe reloading $DESC" oldpid=$(pidofproc -p $PIDFILE) if [ "$oldpid" ] ; then kill -s USR2 $oldpid log_action_msg "signal USR2 sent to $NAME master process (pid $oldpid) to fork a new master and workers processes" log_action_msg "waiting for the new process to successfully start..." printf "." sleep 1 while [ "$(pidofproc -p $PIDFILE.oldbin)" ]; do printf "." sleep 1 done echo newpid=$(pidofproc -p $PIDFILE) # If the PID has definitively changed than the new master is running. if [ "$oldpid" != "$newpid" ] ; then log_success_msg "server correctly reloaded, new master process has pid $newpid" log_end_msg 0 # If not, the new master process couldn''t start due to the new # configuration or code error. else log_failure_msg "error running a new master process, $NAME was not reloaded (check error log)" log_end_msg 1 fi else log_failure_msg "$NAME is not running" log_end_msg 1 fi ;; And I test the following loop to check it: while [ 0 ] ; do /etc/init.d/myserver reload ; done -- I?aki Baz Castillo <ibc at aliax.net>