Hi everyone,

While writing a script to determine the success or failure of a Unicorn reload attempt (without having to parse a log), I noticed that Unicorn doesn't preserve the timestamp of its pid file. In other words, instead of renaming pidfile to pidfile.oldbin (and then back again if the reload failed), it creates a new pid file for each master phase change.

This means we cannot simply compare the mtime of the current pidfile against the time the USR2 signal was given in order to make a reasonable conclusion.

I tried another method, which was to look at the start time of the process as reported by ps(1), but on Linux, that time does not come from the wall clock: it's derived from the number of jiffies since system boot. So it's not guaranteed to be accurate, especially if the wall clock was incorrect at system boot.

Are there any other methods anyone can suggest? Otherwise, a change to Unicorn's behavior with respect to pid file maintenance would be kindly appreciated.

Best regards,

--Michael
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying
Michael Fischer <mfischer@zendesk.com> wrote:
> Hi everyone,
>
> While writing a script to determine the success or failure of a
> Unicorn reload attempt (without having to parse a log), I noticed that
> Unicorn doesn't preserve the timestamp of its pid file. In other
> words, instead of renaming pidfile to pidfile.oldbin (and then back
> again if the reload failed), it creates a new pid file for each master
> phase change.
>
> This means we cannot simply compare the mtime of the current pidfile
> against the time the USR2 signal was given in order to make a
> reasonable conclusion.
>
> I tried another method, which was to look at the start time of the
> process as reported by ps(1), but on Linux, that time does not come
> from the wall clock: it's derived from the number of jiffies since
> system boot. So it's not guaranteed to be accurate, especially if the
> wall clock was incorrect at system boot.
>
> Are there any other methods anyone can suggest? Otherwise, a change
> to Unicorn's behavior with respect to pid file maintenance would be
> kindly appreciated.

I read and stash the value of the pid file before issuing any USR2. Later, you can issue "kill -0 $old_pid" after sending SIGQUIT to ensure it's dead.

Checking the mtime of the pidfile is really bizarre...

OTOH, there's times when users accidentally remove a pid file and regenerate it by hand from ps(1), too...
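A minimal Ruby sketch of the stash-then-probe approach described in the message above. The helper names and the pid file path are hypothetical; `Process.kill(0, pid)` is the Ruby counterpart of `kill -0`. As the thread goes on to discuss, this check cannot tell the old master apart from an unrelated process that has since been given the same PID.

```ruby
# Sketch only: helper names and the pid file path are hypothetical.

# Read the master PID out of a pid file.
def read_pid(path)
  Integer(File.read(path).strip, 10)
end

# Counterpart of `kill -0 $pid`: signal 0 runs the existence and
# permission checks without actually delivering a signal.
def process_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false # no such process
rescue Errno::EPERM
  true  # the process exists, but we are not allowed to signal it
end

# Intended use (not executed here):
#   old_pid = read_pid("/path/to/unicorn.pid")  # stash before USR2
#   Process.kill(:USR2, old_pid)
#   ... once the new master is up ...
#   Process.kill(:QUIT, old_pid)
#   sleep 0.1 while process_alive?(old_pid)
```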
On Wed, Oct 23, 2013 at 5:53 PM, Eric Wong <normalperson@yhbt.net> wrote:
> I read and stash the value of the pid file before issuing any USR2.
> Later, you can issue "kill -0 $old_pid" after sending SIGQUIT
> to ensure it's dead.

That's inherently racy; another process can claim the old PID in the interim.

> Checking the mtime of the pidfile is really bizarre...

Perhaps (though it's a normative criticism), but on the other hand, it isn't subject to the race above.

> OTOH, there's times when users accidentally remove a pid
> file and regenerate it by hand from ps(1), too...

Sure, but (a) that's a corner case I'm not particularly concerned about, and (b) it wouldn't cause any problems, assuming the user did this before any reload attempt, and not in the middle or something.

--Michael
Michael Fischer <mfischer@zendesk.com> wrote:
> On Wed, Oct 23, 2013 at 5:53 PM, Eric Wong <normalperson@yhbt.net> wrote:
>
> > I read and stash the value of the pid file before issuing any USR2.
> > Later, you can issue "kill -0 $old_pid" after sending SIGQUIT
> > to ensure it's dead.
>
> That's inherently racy; another process can claim the old PID in the interim.

Right, but raciness goes for anything regarding pid files.

The OS does make an effort to avoid recycling PIDs too often, and going through all the PIDs in a system quickly is probably rare. I haven't hit it, at least.

> > Checking the mtime of the pidfile is really bizarre...
>
> Perhaps (though it's a normative criticism), but on the other hand, it
> isn't subject to the race above.

It's still racy in a different way, though (file could change right after checking).

> > OTOH, there's times when users accidentally remove a pid
> > file and regenerate it by hand from ps(1), too...
>
> Sure, but (a) that's a corner case I'm not particularly concerned
> about, and (b) it wouldn't cause any problems, assuming the user did
> this before any reload attempt, and not in the middle or something.

Having the process start time in /proc be unreliable because the server has the wrong time is also in the same category of corner cases.

Also, can you check the inode of the /proc/$pid entry? Perhaps PID files are horrible, really :<
On Wed, Oct 23, 2013 at 7:03 PM, Eric Wong <normalperson@yhbt.net> wrote:
>> > I read and stash the value of the pid file before issuing any USR2.
>> > Later, you can issue "kill -0 $old_pid" after sending SIGQUIT
>> > to ensure it's dead.
>>
>> That's inherently racy; another process can claim the old PID in the interim.
>
> Right, but raciness goes for anything regarding pid files.
>
> The OS does make an effort to avoid recycling PIDs too often,
> and going through all the PIDs in a system quickly is
> probably rare. I haven't hit it, at least.

That's not good enough.

The fact that the pid file contains a pid is immaterial to me; I don't even need to look at it. I only care about when it was created, or what its inode number is, so that I can detect whether Unicorn was last successfully started or restarted. rename(2) is atomic per POSIX and is not subject to race conditions.

>> > Checking the mtime of the pidfile is really bizarre...
>>
>> Perhaps (though it's a normative criticism), but on the other hand, it
>> isn't subject to the race above.
>
> It's still racy in a different way, though (file could change right
> after checking).

If the file's mtime or inode number changes under my proposal, that means the reload must have been successful. What race condition are you referring to that would render this conclusion inaccurate?

> Having the process start time in /proc be unreliable because the server
> has the wrong time is also in the same category of corner cases.

This is absolutely not true. A significant minority, if not a majority, of servers will have at least slightly inaccurate wall clocks on boot. This is usually corrected during boot by an NTP sync, but by then the die has already been cast insofar as ps(1) output is concerned.

> Also, can you check the inode of the /proc/$pid entry? Perhaps

That's not portable.

> PID files are horrible, really :<

To reiterate, I'm not using the PID file in this instance to determine Unicorn's PID. It could be empty, for all I care.

--Michael
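The detection idea in the message above (treat a changed inode number or mtime as evidence of a successful reload) could be sketched in Ruby roughly as follows. The names `PidFileId`, `pid_file_id` and `pid_file_replaced?` are hypothetical, and the sketch assumes the rename-based behaviour being proposed, not what Unicorn did at the time of this message.

```ruby
# Sketch only: these names are hypothetical, not part of Unicorn.
PidFileId = Struct.new(:ino, :mtime)

# Capture the identity of the pid file; its contents are never read.
def pid_file_id(path)
  st = File.stat(path)
  PidFileId.new(st.ino, st.mtime)
rescue Errno::ENOENT
  nil # no pid file at the moment
end

# Under the proposed scheme a failed reload renames the original file
# back into place, so inode and mtime are unchanged; a successful
# reload leaves a newly created file behind.
def pid_file_replaced?(before, after)
  return false if before.nil? || after.nil?
  before.ino != after.ino || before.mtime != after.mtime
end
```

A caller would take one snapshot before sending USR2 and another once the reload attempt has settled, then compare the two.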
Michael Fischer <mfischer@zendesk.com> wrote:
> On Wed, Oct 23, 2013 at 7:03 PM, Eric Wong <normalperson@yhbt.net> wrote:
>
> >> > I read and stash the value of the pid file before issuing any USR2.
> >> > Later, you can issue "kill -0 $old_pid" after sending SIGQUIT
> >> > to ensure it's dead.
> >>
> >> That's inherently racy; another process can claim the old PID in the interim.
> >
> > Right, but raciness goes for anything regarding pid files.
> >
> > The OS does make an effort to avoid recycling PIDs too often,
> > and going through all the PIDs in a system quickly is
> > probably rare. I haven't hit it, at least.
>
> That's not good enough.
>
> The fact that the pid file contains a pid is immaterial to me; I don't
> even need to look at it. I only care about when it was created, or
> what its inode number is, so that I can detect whether Unicorn was
> last successfully started or restarted. rename(2) is atomic per POSIX
> and is not subject to race conditions.

Right, we looked at using rename last year but I didn't think it's possible given we need to write the pid file before binding new listen sockets

http://mid.gmane.org/20121127215146.GA23452@dcvr.yhbt.net

But perhaps we can drop the pid file late iff ENV["UNICORN_FD"] is detected. I'll see if that can be done w/o breaking compatibility.

> >> > Checking the mtime of the pidfile is really bizarre...
> >>
> >> Perhaps (though it's a normative criticism), but on the other hand, it
> >> isn't subject to the race above.
> >
> > It's still racy in a different way, though (file could change right
> > after checking).
>
> If the file's mtime or inode number changes under my proposal, that
> means the reload must have been successful. What race condition are
> you referring to that would render this conclusion inaccurate?

It doesn't mean the process didn't exit/crash right after writing the PID.

> > Having the process start time in /proc be unreliable because the server
> > has the wrong time is also in the same category of corner cases.
>
> This is absolutely not true. A significant minority, if not a
> majority, of servers will have at least slightly inaccurate wall
> clocks on boot. This is usually corrected during boot by an NTP sync,
> but by then the die has already been cast insofar as ps(1) output is
> concerned.

But NTP syncs early in the boot process before most processes (including unicorn) are started. It shouldn't matter, then, right?

> > Also, can you check the inode of the /proc/$pid entry? Perhaps
>
> That's not portable.
>
> > PID files are horrible, really :<
>
> To reiterate, I'm not using the PID file in this instance to determine
> Unicorn's PID. It could be empty, for all I care.

OK. I assume you do the same for nginx?
On Thu, Oct 24, 2013 at 11:21 AM, Eric Wong <normalperson@yhbt.net> wrote:
> Right, we looked at using rename last year but I didn't think it's possible
> given we need to write the pid file before binding new listen sockets
>
> http://mid.gmane.org/20121127215146.GA23452@dcvr.yhbt.net
>
> But perhaps we can drop the pid file late iff ENV["UNICORN_FD"] is
> detected. I'll see if that can be done w/o breaking compatibility.

My opinion is that supporting backward compatibility cases that are clearly poorly designed, at least in open-source software, is ill-advised. (I'm referring to the Mongrel compatibility semantics discussed in that article.)

That aside, I don't yet understand this "need" you're referring to. The control flow I'm proposing is as follows:

(1) Previous-generation parent (P) receives SIGUSR2.
(2) P renames unicorn.pid to unicorn.oldpid
(3) P forks child (P'); if fork unsuccessful, P renames unicorn.oldpid to unicorn.pid.
(4) P' calls exec and attempts to start; creates unicorn.pid. P watches for SIGCHLD from P'. If received, P renames unicorn.oldpid to unicorn.pid.
(5) P' sends SIGQUIT to P. P' unlinks unicorn.oldpid. P' is now P.

What am I missing here? This is, to my knowledge, precisely what nginx does (http://wiki.nginx.org/CommandLine#Upgrading_To_a_New_Binary_On_The_Fly).

>> If the file's mtime or inode number changes under my proposal, that
>> means the reload must have been successful. What race condition are
>> you referring to that would render this conclusion inaccurate?
>
> It doesn't mean the process didn't exit/crash right after writing the PID.

That should not happen per (4) above.

> But NTP syncs early in the boot process before most processes (including
> unicorn) are started. It shouldn't matter, then, right?

Truth be told, I'm not completely certain why this is an issue. My reading of procps and the kernel suggests it should be doing the right thing, but I tried this at first:

- Touch a timestamp file before sending P a SIGUSR2.
- Wait for oldpid to disappear
- Read the stime field from ps(1) for the remaining master process (P or P')
- If stime < mtime of timestamp: new process failed. If stime > mtime, new process succeeded.

But for reasons unclear to me, sometimes the stime of P' (successful reload) would predate the timestamp! This was obviously agonizing.

>> To reiterate, I'm not using the PID file in this instance to determine
>> Unicorn's PID. It could be empty, for all I care.
>
> OK. I assume you do the same for nginx?

With nginx we have -t; we can at least test the config file and have a reasonable degree of certainty that it will reload properly. With Rack apps, not so much. :)

--Michael
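The file operations in the five-step control flow proposed in the message above can be modelled with a short Ruby sketch. Signal handling, fork and exec are deliberately left out; the method names are hypothetical, and the `.oldpid` suffix follows the naming used in the message. This is an illustration of the proposal, not Unicorn's implementation.

```ruby
# Sketch only: method names are hypothetical; only the pid file
# operations of steps (2) through (5) are modelled.

# unicorn.pid -> unicorn.oldpid, in the same directory.
def old_pid_path(pid_path)
  File.join(File.dirname(pid_path), File.basename(pid_path, ".pid") + ".oldpid")
end

# Step (2): the old master P moves its pid file aside.
def begin_reload(pid_path)
  File.rename(pid_path, old_pid_path(pid_path))
end

# Steps (3)/(4), failure branch: fork failed or P saw SIGCHLD from P'.
# The original file comes back with its inode and mtime intact.
def rollback_reload(pid_path)
  File.rename(old_pid_path(pid_path), pid_path)
end

# Step (4), success branch: the new master P' creates its own pid file.
def write_new_pid(pid_path, pid)
  flags = File::WRONLY | File::CREAT | File::EXCL
  File.open(pid_path, flags, 0644) { |f| f.write("#{pid}\n") }
end

# Step (5): P' removes the old file once P has been told to quit.
def finish_reload(pid_path)
  File.unlink(old_pid_path(pid_path))
end
```

Because the rollback is a plain rename, an external observer watching the inode or mtime of unicorn.pid sees a change only when a new master has written its own file.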
Michael Fischer <mfischer@zendesk.com> wrote:
> On Thu, Oct 24, 2013 at 11:21 AM, Eric Wong <normalperson@yhbt.net> wrote:
>
> > Right, we looked at using rename last year but I didn't think it's possible
> > given we need to write the pid file before binding new listen sockets
> >
> > http://mid.gmane.org/20121127215146.GA23452@dcvr.yhbt.net
> >
> > But perhaps we can drop the pid file late iff ENV["UNICORN_FD"] is
> > detected. I'll see if that can be done w/o breaking compatibility.
>
> My opinion is that supporting backward compatibility cases that are
> clearly poorly designed, at least in open-source software, is
> ill-advised. (I'm referring to the Mongrel compatibility semantics
> discussed in that article.)
>
> That aside, I don't yet understand this "need" you're referring to.
> The control flow I'm proposing is as follows:

I'm not really sure, either; I just remember it was somewhat important to Mongrel back then. I'll get back to this later today/tomorrow. Your control flow looks correct, though.

> > But NTP syncs early in the boot process before most processes (including
> > unicorn) are started. It shouldn't matter, then, right?
>
> Truth be told, I'm not completely certain why this is an issue. My
> reading of procps and the kernel suggests it should be doing the right
> thing, but I tried this at first:
>
> - Touch a timestamp file before sending P a SIGUSR2.
> - Wait for oldpid to disappear
> - Read the stime field from ps(1) for the remaining master process (P or P')
> - If stime < mtime of timestamp: new process failed. If stime >
> mtime, new process succeeded.
>
> But for reasons unclear to me, sometimes the stime of P' (successful
> reload) would predate the timestamp! This was obviously agonizing.

OK, comparing mtime vs calculated value of stime is not possible because of time adjustments. Process start time is stored as monotonic time, and ps(1) converts it to real clock time. So you can only compare stimes between different processes. Comparing stime to the mtime/ctime/atime of any file will not work reliably.
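A hedged, Linux-only Ruby sketch of comparing start times between processes, as the message above suggests. It reads the raw `starttime` value (field 22 of `/proc/<pid>/stat`, in clock ticks since boot, per proc(5)) and never converts it to wall-clock time. The method names are hypothetical.

```ruby
# Sketch only, Linux-specific; method names are hypothetical.

# Extract field 22 (starttime, in clock ticks since boot) from the
# contents of /proc/<pid>/stat. Field 2 (comm) is wrapped in
# parentheses and may itself contain spaces or ")", so the remaining
# fields are taken from after the last ")".
def parse_starttime(stat_line)
  rest = stat_line[(stat_line.rindex(")") + 1)..-1].split
  # rest[0] is field 3 (state), so field 22 is rest[19].
  Integer(rest[19], 10)
end

def starttime_of(pid)
  parse_starttime(File.read("/proc/#{pid}/stat"))
end

# True if process pid_a started strictly before process pid_b.
# Both values share the same boot-relative clock, so the comparison is
# not affected by later wall-clock adjustments.
def started_before?(pid_a, pid_b)
  starttime_of(pid_a) < starttime_of(pid_b)
end
```

One way to apply this would be to compare the surviving master against a short-lived marker process started just before USR2 is sent, instead of against a file's mtime. Tick resolution is coarse, so two processes started within the same tick compare as equal.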
Michael Fischer <mfischer@zendesk.com> wrote:
> (1) Previous-generation parent (P) receives SIGUSR2.
> (2) P renames unicorn.pid to unicorn.oldpid
> (3) P forks child (P'); if fork unsuccessful, P renames unicorn.oldpid
> to unicorn.pid.
> (4) P' calls exec and attempts to start; creates unicorn.pid. P
> watches for SIGCHLD from P'. If received, P renames unicorn.oldpid to
> unicorn.pid.
> (5) P' sends SIGQUIT to P. P' unlinks unicorn.oldpid. P' is now P.
>
> What am I missing here? This is, to my knowledge, precisely what
> nginx does (http://wiki.nginx.org/CommandLine#Upgrading_To_a_New_Binary_On_The_Fly).

OK, this is probably safe and does what you want. It's sitting in master for now:

------------------------------ 8< --------------------------------
From: Eric Wong <e@80x24.org>
Subject: [PATCH] attempt to rename PID file when possible

This will preserve mtime on successful renames for comparisons.
While we're at it, avoid writing the new PID until the listeners are
inherited successfully. This can be useful to avoid accidentally
clobbering a good PID if binding the listener or building the app
(preload_app==true) fails
---
 lib/unicorn/http_server.rb | 48 +++++++++++++++++++++++++++++++++++-----------
 1 file changed, 37 insertions(+), 11 deletions(-)

diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index bed24d0..cd160c5 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -134,11 +134,22 @@ class Unicorn::HttpServer
     # Note that signals don't actually get handled until the #join method
     QUEUE_SIGS.each { |sig| trap(sig) { SIG_QUEUE << sig; awaken_master } }
     trap(:CHLD) { awaken_master }
-    self.pid = config[:pid]
+
+    # write pid early for Mongrel compatibility if we're not inheriting sockets
+    # This was needed for compatibility with some health checker a long time
+    # ago. This unfortunately has the side effect of clobbering valid PID
+    # files.
+    self.pid = config[:pid] unless ENV["UNICORN_FD"]
 
     self.master_pid = $$
     build_app! if preload_app
     bind_new_listeners!
+
+    # Assuming preload_app==false, we drop the pid file after the app is ready
+    # to process requests. If binding or build_app! fails with
+    # preload_app==true, we'll never get here and the parent will recover
+    self.pid = config[:pid] if ENV["UNICORN_FD"]
+
     spawn_missing_workers
     self
   end
@@ -180,6 +191,21 @@ class Unicorn::HttpServer
     Unicorn::HttpRequest::DEFAULTS["rack.logger"] = @logger = obj
   end
 
+  def clobber_pid(path)
+    unlink_pid_safe(@pid) if @pid
+    if path
+      fp = begin
+        tmp = "#{File.dirname(path)}/#{rand}.#$$"
+        File.open(tmp, File::RDWR|File::CREAT|File::EXCL, 0644)
+      rescue Errno::EEXIST
+        retry
+      end
+      fp.syswrite("#$$\n")
+      File.rename(fp.path, path)
+      fp.close
+    end
+  end
+
   # sets the path for the PID file of the master process
   def pid=(path)
     if path
@@ -194,18 +220,18 @@ class Unicorn::HttpServer
           "(or pid=#{path} is stale)"
       end
     end
-    unlink_pid_safe(pid) if pid
 
-    if path
-      fp = begin
-        tmp = "#{File.dirname(path)}/#{rand}.#$$"
-        File.open(tmp, File::RDWR|File::CREAT|File::EXCL, 0644)
-      rescue Errno::EEXIST
-        retry
+    # rename the old pid if posible
+    if @pid && path
+      begin
+        File.rename(@pid, path)
+      rescue Errno::ENOENT, Errno::EXDEV
+        # a user may have accidentally removed the original.
+        # Obviously cross-FS renames
+        clobber_pid(path)
       end
-      fp.syswrite("#$$\n")
-      File.rename(fp.path, path)
-      fp.close
+    else
+      clobber_pid(path)
     end
     @pid = path
   end
-- 
1.8.4.483.g7fe67e6.dirty
On Fri, Oct 25, 2013 at 12:58 AM, Eric Wong <normalperson@yhbt.net> wrote:
> + # rename the old pid if posible

Nitpick: you have a typo here. :)

-- 
Phusion | Ruby & Rails deployment, scaling and tuning solutions
Web: http://www.phusion.nl/
E-mail: info@phusion.nl
Chamber of commerce no: 08173483 (The Netherlands)
Hongli Lai <hongli@phusion.nl> wrote:
> On Fri, Oct 25, 2013 at 12:58 AM, Eric Wong <normalperson@yhbt.net> wrote:
> > + # rename the old pid if posible
>
> Nitpick: you have a typo here. :)

Thanks, will push :)

Subject: [PATCH] http_server: fixup comments for PID file renaming

Thanks to Hongli Lai for noticing my typo. While we're at it, finish
up a halfway-written comment for the EXDEV case
---
 lib/unicorn/http_server.rb | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index a6266ea..2decd77 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -221,13 +221,13 @@ class Unicorn::HttpServer
       end
     end
 
-    # rename the old pid if posible
+    # rename the old pid if possible
     if @pid && path
       begin
         File.rename(@pid, path)
       rescue Errno::ENOENT, Errno::EXDEV
-        # a user may have accidentally removed the original.
-        # Obviously cross-FS renames
+        # a user may have accidentally removed the original,
+        # obviously cross-FS renames don't work, either.
         clobber_pid(path)
       end
     else
-- 
1.8.4.483.g7fe67e6.dirty