thr3ads.net - Backgroundrb devel - [Backgroundrb-devel] trouble stopping backgroundrb [Sep 2008]

Ryan Case

2008-Sep-16 16:11 UTC

[Backgroundrb-devel] trouble stopping backgroundrb

Hi folks -

I'm having trouble getting backgroundrb to stop after one of the
packet_worker_r processes dies.

If backgroundrb is running properly,
"/path/to/application/script/backgroundrb stop" works fine, but often
one of the packet_worker_r processes dies, and the stop command no
longer works after that (it runs, but it does not stop the processes,
and so then start doesn't work).

The only thing that seems to work at that point is to manually kill
the processes that are still running, and then the start works, but
that is going to make restarting via monit a lot less clean.

Any ideas would be much appreciated!

I'm using the GitHub version of backgroundrb and packet 0.1.13, running on
Ubuntu.

Thanks!
Ryan

Jonathan Wallace

2008-Sep-17 15:42 UTC

Hi Ryan,

I recently ran into the same issue, where the backgroundrb process
would not respond to the ./script/backgroundrb stop command.  The pid file
was being deleted but the actual process was not being killed.  I'm
running packet 0.1.12 on gentoo.

I'm not exactly sure what conditions put backgroundrb into such a
state, but I've decided to modify script/backgroundrb to behave a
little differently.

My hypothesis is that if one of the Process.kill method calls in
script/backgroundrb raises an exception, the pid file is deleted even
though the kill signal is never sent.  From that point on, starting
and stopping backgroundrb never affects the original, still-running
backgroundrb process.

There are a couple of reasons why I believe an exception could be
raised.  Either Process.getpgid(pid), Process.kill('TERM', pid) or
Process.kill('-TERM', pgid) raises an exception, or the effective
uid of the user running script/backgroundrb stop does not have
permission to kill those processes.
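
A minimal sketch of the failure mode described above, assuming the stop code has the same begin/rescue/ensure shape as the original kill_process. The names stop_with_ensure and failing_kill are hypothetical, and the lambda stands in for Process.kill so the permission error can be simulated without touching a real process. It uses File.exist? rather than File.exists?, since the latter is gone from recent Ruby versions.

```ruby
require "tmpdir"

# Same shape as the original kill_process: the ensure clause runs whether
# or not the signal was actually delivered.
# `killer` is a hypothetical stand-in for Process.kill.
def stop_with_ensure(pid_file, killer)
  pid = File.read(pid_file).strip.to_i
  begin
    killer.call("TERM", pid)
  rescue Errno::ESRCH
    puts "Deleting pid file"
  rescue StandardError => e
    puts e.message
  ensure
    File.delete(pid_file) if File.exist?(pid_file)
  end
end

pid_file = File.join(Dir.mktmpdir, "backgroundrb_demo.pid")
File.write(pid_file, Process.pid.to_s)

# Simulate a kill that fails before any signal is sent.
failing_kill = ->(_signal, _pid) { raise Errno::EPERM }
stop_with_ensure(pid_file, failing_kill)

# The "process" (this script) is still alive, but its pid file is gone.
puts File.exist?(pid_file)
```

Once the pid file is gone, a later stop has nothing to read, which matches the symptom reported in this thread.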

To fix this, we've removed the Process.getpgid call and the two
Process.kill calls that send the TERM signal.  Since we've
architected our backgroundrb jobs to be persistent and idempotent (a
db-backed queue written before the feature appeared in bdrb), we'll
just use the KILL signal.

Thoughts?

Thanks,
 Jonathan

John O'Shea

2008-Sep-17 16:35 UTC

Jonathan,
    Glad you raised this, I've been spending some time trying to
diagnose this exact same problem.
    The exception handling code in the "when 'stop'" block (in
script/backgroundrb) could definitely be improved somewhat:
- check that the process with 'pid' exists before trying to kill it
- rescue permission exceptions (Errno::EPERM)
- only delete the pid file if the process no longer exists (in the
  ensure block)
- be a little more verbose to stdout/stderr
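
The checks in the list above could be sketched roughly as follows. This is only an illustration, not code from script/backgroundrb: process_alive? and remove_stale_pid_file are hypothetical helper names. It relies on signal 0, which performs the existence and permission checks without delivering a signal.

```ruby
# Returns true if a process with this pid exists.
def process_alive?(pid)
  Process.kill(0, pid)   # signal 0: check only, nothing is delivered
  true
rescue Errno::ESRCH
  false                  # no such process
rescue Errno::EPERM
  true                   # process exists but belongs to another user
end

# Hypothetical helper: remove the pid file only once the process is gone,
# and say what happened either way.
def remove_stale_pid_file(pid_file)
  pid = File.read(pid_file).strip.to_i
  if process_alive?(pid)
    warn "process #{pid} is still running; keeping #{pid_file}"
    false
  else
    File.delete(pid_file)
    puts "removed stale pid file #{pid_file}"
    true
  end
end
```

Keeping the pid file while the process is alive means a later stop can still find the process to signal.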

While we are on the subject of shutdown: when the backgroundrb process
gets a HUP signal, does it wait for existing workers to complete any work
methods that are executing, or is the 'Process.kill('-TERM', pgid)' call
intended to make the OS handle this?

We use capistrano to deploy our application (stopping and restarting
backgroundrb after the rails app has been updated).  It would be great
if we could have more predictability regarding shutting down
backgroundrb (i.e. have backgroundrb disable the reactor loop in
idle workers, wait for all active workers to finish their methods, then
shut down).
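
One common way to get that kind of predictable shutdown is for the signal handler to set a flag that the worker loop checks between jobs. The sketch below illustrates the pattern only; run_jobs and the job lambdas are hypothetical and this is not how BackgrounDRb or packet implement their workers.

```ruby
# Hypothetical worker loop: TERM only sets a flag, and the loop checks it
# between jobs, so the job in progress always runs to completion.
def run_jobs(jobs)
  stopping = false
  previous = Signal.trap("TERM") { stopping = true }
  completed = []
  jobs.each do |job|
    break if stopping        # accept no new work after TERM
    completed << job.call    # the current job is never interrupted
  end
  completed
ensure
  # Put back whatever handler was installed before.
  Signal.trap("TERM", previous || "DEFAULT")
end

jobs = [
  -> { :first },
  # TERM arrives in the middle of this job; the sleep stands in for
  # the remaining work.
  -> { Process.kill("TERM", Process.pid); sleep 0.1; :second },
  -> { :third }
]

completed = run_jobs(jobs)
p completed
```

The second job still finishes after TERM arrives, and the third is never started.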

John.

-- 
John O'Shea, CTO at Nooked
www: http://www.nooked.com/
cell: +353 87 992 9959
skype: joshea

hemant kumar

2008-Sep-17 17:06 UTC

Hi Jonathan & Ryan,

The problem is cross-platform behaviour: just sending "KILL" won't work
across all platforms. Also, we send Process.kill("-TERM", pgid)
because this ensures that the corresponding workers also get killed.

What exception did you get when this happened? That could shed some
light on the issue.
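
For readers unfamiliar with the leading minus: Ruby's Process.kill treats a signal name that starts with a minus sign as addressed to a process group rather than a single process. A small sketch, assuming a Unix-like system with a sleep command on the PATH; the child here merely stands in for the backgroundrb master.

```ruby
# Start a child as the leader of a new process group (pgroup: true), so
# its pgid equals its pid. Anything it forked would inherit that group.
pid  = Process.spawn("sleep", "60", pgroup: true)
pgid = Process.getpgid(pid)

# A signal name with a leading minus targets the whole process group.
Process.kill("-TERM", pgid)

Process.wait(pid)
status = $?
puts "terminated by signal #{status.termsig}" if status.signaled?
```

Because forked workers stay in the master's process group unless they change it themselves, one group-wide TERM reaches the master and its workers together.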

On my ubuntu box, 

Woody Peterson

2008-Sep-17 20:14 UTC

head link

[Backgroundrb-devel] trouble stopping backgroundrb

I too have been having the same issue. Every time I try to restart
backgroundrb after an update to our application (about once a day), I
have to forcefully kill it myself. However, I haven't been able to
reproduce it in a controlled setting. After I kill and start it, it
all works ok. I tried killing packet_worker processes (even with -9),
but it still shuts down correctly on the stop command. I'll let it run
for a while and try tomorrow, but has anyone been able to predictably
reproduce the issue?

-Woody

(debian etch, packet 0.1.10)

Ryan Case

2008-Sep-17 21:48 UTC

I'm not able to reproduce the issue consistently. Often killing (-9) a
packet_worker will create the issue, but not always.

Jonathan - thanks for the info! Skipping the pgid sounds interesting,
since what I do to manually fix it is usually just kill the pid for
the backgroundrb process (however, as Hemant mentioned, I guess that
might leave workers still running). And it definitely seems like it is
failing in the kill block and deleting the pid file even though the
process is still running.

Hemant - I don't see much in the way of exceptions when I run into
this issue. The debug log does show the "Address already in use -
bind(2) (Errno::EADDRINUSE)" error when start tries to run, but I
don't see exceptions for the stop in the log.

When I run stop, however, I do get the "Deleting pid file" output,
which suggests that Errno::ESRCH is being rescued. (Not sure if that is
correct, or if there is a way to see more detail on the exception...)
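
One way to see more detail is to print the exception's class and message instead of a fixed string. A small sketch, where pgid_or_report is a hypothetical wrapper and not part of script/backgroundrb:

```ruby
# Hypothetical wrapper: report which errno was raised rather than printing
# a fixed "Deleting pid file" message.
def pgid_or_report(pid)
  Process.getpgid(pid)
rescue SystemCallError => e
  # e.class is the specific Errno subclass, e.g. Errno::ESRCH or Errno::EPERM
  warn "getpgid(#{pid}) failed: #{e.class}: #{e.message}"
  nil
end
```

Rescuing SystemCallError covers every Errno class, so permission errors show up as distinct from a missing process.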

Thanks everyone,
Ryan

hemant kumar

2008-Sep-17 22:08 UTC

Okay folks here is a patch to "backgroundrb" script, which should fix
some issues:

diff --git a/script/backgroundrb b/script/backgroundrb
index dabf80b..8d4bb78 100755
--- a/script/backgroundrb
+++ b/script/backgroundrb
@@ -49,18 +49,9 @@ when 'stop'
   def kill_process arg_pid_file
     pid = nil
     File.open(arg_pid_file, "r") { |pid_handle| pid = pid_handle.gets.strip.chomp.to_i }
-    begin
-      pgid =  Process.getpgid(pid)
-      Process.kill('TERM', pid)
-      Process.kill('-TERM', pgid)
-      Process.kill('KILL', pid)
-    rescue Errno::ESRCH => e
-      puts "Deleting pid file"
-    rescue
-      puts $!
-    ensure
-      File.delete(arg_pid_file) if File.exists?(arg_pid_file)
-    end
+    pgid =  Process.getpgid(pid)
+    Process.kill('-TERM', pgid)
+    File.delete(arg_pid_file) if File.exists?(arg_pid_file)
   end
   pid_files = Dir["#{RAILS_HOME}/tmp/pids/backgroundrb_*.pid"]
   pid_files.each { |x| kill_process(x) }

What it does:
1. Kills by process group id only; that is enough to stop the master
process as well.
2. Does not delete the pid file if there was an exception while stopping
the daemon.
3. Does not handle exceptions silently.

Please try this and let me know how it goes.



John O'Shea

2008-Sep-18 10:21 UTC

Slight variation that
- deletes the pid file for already-gone processes
- exits (with error code -1) without deleting the pid file if there was
  a permission problem

     begin
-      pgid =  Process.getpgid(pid)
-      Process.kill('TERM', pid)
-      Process.kill('-TERM', pgid)
-      Process.kill('KILL', pid)
-    rescue Errno::ESRCH => e
-      puts "Deleting pid file"
-    rescue
+      pgid =  Process.getpgid(pid)
+      Process.kill('-TERM', pgid)
+    rescue Errno::ESRCH
+      puts $!
+      # No process - Do nothing.
+    rescue Errno::EPERM
+      # Permission denied.
+      puts $!
+      Process.exit!
     ensure
       File.delete(arg_pid_file) if File.exists?(arg_pid_file)
     end
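
The variation above depends on Process.exit! leaving the process immediately, without running ensure clauses or at_exit handlers, which is why the pid file survives a permission error. A small self-contained sketch of that behaviour, assuming a platform with fork; the marker file is a hypothetical stand-in for the pid file.

```ruby
require "tmpdir"

marker = File.join(Dir.mktmpdir, "demo.pid")
File.write(marker, "12345")

child = fork do
  begin
    Process.exit!(1)       # leaves immediately, skipping ensure
  ensure
    File.delete(marker)    # never reached
  end
end

Process.wait(child)
status = $?
puts "exit status: #{status.exitstatus}"
puts "marker still present: #{File.exist?(marker)}"
```

A plain exit would raise SystemExit, run the ensure clause and delete the file, so the bang version matters here.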

-- 
John O'Shea, CTO at Nooked
www: http://www.nooked.com/
cell: +353 87 992 9959
skype: joshea

Woody Peterson

2008-Sep-18 18:24 UTC

In my particular case I know it's not a permissions issue, as I'm
always using the same user.

I just tried restarting it, and with Hemant's patch I got:

script/backgroundrb:52:in `getpgid': No such process (Errno::ESRCH)

Via the above I found that in this particular case what happened is
that my logrotate wasn't calling stop, only start (it meant to call
stop, but the call sat inside a failing if statement that checked
whether the pid existed). When you call start, it doesn't check to see
if it's already running, so it starts backgroundrb, overwrites the pid
file, then backgroundrb fails to start but has had its pid file
changed. The original process is still running, but can't be stopped
because the pid file no longer holds the correct pid.
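
A guard against that overwrite could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from the script in this message: pid_running? and claim_pid_file are hypothetical names, and the check-then-write sequence is not atomic, so two simultaneous starts could still race.

```ruby
# True if a process with this pid currently exists.
def pid_running?(pid)
  return false unless pid > 0
  Process.kill(0, pid)   # signal 0: existence check only
  true
rescue Errno::EPERM
  true                   # exists, but owned by another user
rescue Errno::ESRCH
  false                  # no such process
end

# Hypothetical guard for `start`: refuse to overwrite the pid file while
# the recorded process is alive; replace it if the recorded pid is stale.
def claim_pid_file(pid_file)
  if File.exist?(pid_file)
    old_pid = File.read(pid_file).strip.to_i
    return false if pid_running?(old_pid)
  end
  File.write(pid_file, Process.pid.to_s)
  true
end
```

With a check like this, a second start leaves the pid file of the running master untouched, so a later stop still signals the right process.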

Thus, I rewrote script/backgroundrb to be more LSB compliant
(http://refspecs.linux-foundation.org/LSB_3.2.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html)
so I don't have to check for existing pid files myself. I made a
patch, but it's almost as big as the script itself and Hemant's patch
didn't apply for me (I must have changed something earlier in the
file), so the whole thing is at the end of the email.

While we're on the topic, is there a place to load all the
requirements other than this file? backgroundrb status takes a matter
of seconds to do a simple File.exists?(pid) 'cuz it has to load all
the backgroundrb requirements. Not that it really matters...

-Woody

#!/usr/bin/env ruby

RAILS_HOME = File.expand_path(File.join(File.dirname(__FILE__),".."))
BDRB_HOME = File.join(RAILS_HOME,"vendor","plugins","backgroundrb")
WORKER_ROOT = File.join(RAILS_HOME,"lib","workers")
WORKER_LOAD_ENV = File.join(RAILS_HOME,"script","load_worker_env")

["server","server/lib","lib","lib/backgroundrb"].each { |x| $LOAD_PATH.unshift(BDRB_HOME + "/#{x}")}
$LOAD_PATH.unshift(WORKER_ROOT)

require "rubygems"
require "yaml"
require "erb"
require "logger"
require "packet"
require "optparse"

require "bdrb_config"
require RAILS_HOME + "/config/boot"
require "active_support"

BackgrounDRb::Config.parse_cmd_options ARGV
BDRB_CONFIG = BackgrounDRb::Config.read_config("#{RAILS_HOME}/config/backgroundrb.yml")

require RAILS_HOME + "/config/environment"
require "bdrb_job_queue"
require "backgroundrb_server"

PID_FILE = "#{RAILS_HOME}/tmp/pids/backgroundrb_#{BDRB_CONFIG[:backgroundrb][:port]}.pid"
SERVER_LOGGER = "#{RAILS_HOME}/log/backgroundrb_debug_#{BDRB_CONFIG[:backgroundrb][:port]}.log"

def kill_process arg_pid_file
   pid = nil
   File.open(arg_pid_file, "r") { |pid_handle| pid = pid_handle.gets.strip.chomp.to_i }
   pgid =  Process.getpgid(pid)
   puts "stopping backgroundrb"
   Process.kill('-TERM', pgid)
   File.delete(arg_pid_file) if File.exists?(arg_pid_file)
end

def status
   File.exists?(PID_FILE)
end

def start
   if fork
     sleep(5)
     exit
   else
     if status
       puts "already running"
       exit
     end

     puts "starting backgroundrb"

     op = File.open(PID_FILE, "w")
     op.write(Process.pid().to_s)
     op.close
     if BDRB_CONFIG[:backgroundrb][:log].nil? or BDRB_CONFIG[:backgroundrb][:log] != 'foreground'
       log_file = File.open(SERVER_LOGGER,"w+")
       [STDIN, STDOUT, STDERR].each {|desc| desc.reopen(log_file)}
     end

     BackgrounDRb::MasterProxy.new()
   end
end

def stop
   pid_files = Dir["#{RAILS_HOME}/tmp/pids/backgroundrb_*.pid"]
   pid_files.each { |x| kill_process(x) }
end

case ARGV[0]
when 'start'
   start
when 'stop'
   stop
when 'restart'
   stop
   start
when 'status'
   if status
     puts "running"
     exit
   else
     puts "not running"
     exit!(3)
   end
else
   BackgrounDRb::MasterProxy.new()
end


On Sep 18, 2008, at 3:21 AM, John O''Shea wrote:
> Slight variation that
> - deletes pid for already-gone processes
> - exits (with errror code -1) without deleting the pid file if there  
> was a permission problem
>
>    begin
> -      pgid =  Process.getpgid(pid)
> -      Process.kill(''TERM'', pid)
> -      Process.kill(''-TERM'', pgid)
> -      Process.kill(''KILL'', pid)
> -    rescue Errno::ESRCH => e
> -      puts "Deleting pid file"
> -    rescue
> +      pgid =  Process.getpgid(pid)     +     
Process.kill(''-TERM'',
> pgid)     +    rescue Errno::ESRCH
> +      puts $!
> +      # No process - Do nothing.
> +    rescue Errno::EPERM
> +      # Permission denied.   +      puts $!
> +      Process.exit!
>   ensure        File.delete(arg_pid_file) if File.exists? 
> (arg_pid_file)
>   end
> hemant kumar wrote:
>> Okay folks here is a patch to "backgroundrb" script, which should fix
>> some issues:
>>
>> diff --git a/script/backgroundrb b/script/backgroundrb
>> index dabf80b..8d4bb78 100755
>> --- a/script/backgroundrb
>> +++ b/script/backgroundrb
>> @@ -49,18 +49,9 @@ when 'stop'
>>   def kill_process arg_pid_file
>>     pid = nil
>>     File.open(arg_pid_file, "r") { |pid_handle| pid = pid_handle.gets.strip.chomp.to_i }
>> -    begin
>> -      pgid =  Process.getpgid(pid)
>> -      Process.kill('TERM', pid)
>> -      Process.kill('-TERM', pgid)
>> -      Process.kill('KILL', pid)
>> -    rescue Errno::ESRCH => e
>> -      puts "Deleting pid file"
>> -    rescue
>> -      puts $!
>> -    ensure
>> -      File.delete(arg_pid_file) if File.exists?(arg_pid_file)
>> -    end
>> +    pgid =  Process.getpgid(pid)
>> +    Process.kill('-TERM', pgid)
>> +    File.delete(arg_pid_file) if File.exists?(arg_pid_file)
>>   end
>>   pid_files = Dir["#{RAILS_HOME}/tmp/pids/backgroundrb_*.pid"]
>>   pid_files.each { |x| kill_process(x) }
>>
>> What it does is:
>> 1. Deleting by group id is enough for master process.
>> 2. Do not delete the pid file if, there was an exception while stopping
>> the daemon.
>> 3. Do not handle exceptions silently.
>>
>> Please try this and let me know, how it goes.
>>
>>
>>
>> On Wed, 2008-09-17 at 17:35 +0100, John O'Shea wrote:
>>
>>> Jonathan,
>>>    Glad you raised this, I've been spending some time trying to
>>> diagnose this exact same problem.
>>>    The exception handling code in the "when 'stop'" block (in
>>> script/backgroundrb) could definitely be improved somewhat
>>> - check that the process with 'pid' exists before trying to kill it
>>> - rescue permission exceptions (Errno::EPERM)
>>> - only delete the pid file if the process pid does not still exist
>>>   (in ensure block)
>>> - be a little more verbose to stdout/stderr
>>>
>>> While we are on the subject of shutdown, - when the backgroundrb
>>> process gets a HUP signal does it wait for existing workers to
>>> complete any work methods that are executing or is the
>>> 'Process.kill('-TERM', pgid)' call intended to make the OS handle
>>> this?
>>> We use capistrano to deploy our application (stopping and
>>> restarting backgroundrb after the rails app has been updated).  It
>>> would be great if we could have more predictability regarding
>>> shutting down backgroundrb (i.e. have the backgroundrb disable the
>>> reactor loop in idle workers and wait for all active workers to
>>> finish methods, then shutdown).
>>>
>>> John.
>>>
> -- 
> John O'Shea, CTO at Nooked
> www: http://www.nooked.com/
> cell: +353 87 992 9959
> skype: joshea
>
> _______________________________________________
> Backgroundrb-devel mailing list
> Backgroundrb-devel at rubyforge.org
> http://rubyforge.org/mailman/listinfo/backgroundrb-devel
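
The suggestions quoted above for the stop path (signal the process group, treat a missing process as a stale pid file, and keep the pid file when the signal is refused) can be combined roughly as follows. This is a minimal sketch, not the script that ships with backgroundrb: `stop_daemon` and its return symbols are hypothetical names, and it assumes the pid file holds a single integer pid.

```ruby
# Sketch only: stop_daemon and the symbols it returns are hypothetical,
# not part of backgroundrb.
def stop_daemon(pid_file)
  pid  = Integer(File.read(pid_file).strip)
  pgid = Process.getpgid(pid)     # raises Errno::ESRCH if the process is gone
  Process.kill('-TERM', pgid)     # leading minus: signal the whole process group
  File.delete(pid_file)
  :stopped
rescue Errno::ESRCH
  # No such process: the pid file is stale, so removing it is safe.
  File.delete(pid_file) if File.exist?(pid_file)
  :stale
rescue Errno::EPERM
  # Not allowed to signal: the daemon may still be running, keep the pid file.
  warn "cannot signal process from #{pid_file}: permission denied"
  :denied
end
```

Returning a symbol instead of calling `Process.exit!` leaves the choice of exit status to the caller, which may be convenient when the script is driven by monit or capistrano.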

hemant kumar

2008-Sep-19 03:14 UTC

[Backgroundrb-devel] trouble stopping backgroundrb

Okay,

So, did you find out why "stop" didn't work from logrotate in the first
place? I think that's rather critical.


On Thu, 2008-09-18 at 11:24 -0700, Woody Peterson wrote:
> In my particular case I know it's not a permissions issue, as I'm
> always using the same user.
> 
> I just tried restarting it, and with Hemant's patch I got:
> 
> script/backgroundrb:52:in `getpgid': No such process (Errno::ESRCH)
> 
> Via the above I found that in this particular case what happened is
> that my logrotate wasn't calling stop, only start (it meant to call
> stop, but was in a failing if statement checking if the pid existed).
> When you call start, it doesn't check to see if it's already running,
> so it starts backgroundrb, overwrites the pid file, then backgroundrb
> fails to start but has had its pid file changed. The original process
> is still running, but can't be stopped because the pid file no longer
> has the correct pid.
> 
> Thus, I rewrote script/backgroundrb to be more LSB compliant
> (http://refspecs.linux-foundation.org/LSB_3.2.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html)
> so I don't have to check for existing pid files myself. I made a
> patch, but it's almost as big as the script itself and Hemant's patch
> didn't apply for me (I must have changed something earlier in the
> file), so the whole thing is at the end of the email.
> 
> While we're on the topic, is there a place to load all the
> requirements other than this file? backgroundrb status takes a matter
> of seconds to do a simple File.exists?(pid) 'cuz it has to load all
> the backgroundrb requirements. Not that it really matters...
> 
> -Woody
> 
> #!/usr/bin/env ruby
> 
> RAILS_HOME = File.expand_path(File.join(File.dirname(__FILE__),".."))
> BDRB_HOME = File.join(RAILS_HOME,"vendor","plugins","backgroundrb")
> WORKER_ROOT = File.join(RAILS_HOME,"lib","workers")
> WORKER_LOAD_ENV = File.join(RAILS_HOME,"script","load_worker_env")
> 
> ["server","server/lib","lib","lib/backgroundrb"].each { |x| $LOAD_PATH.unshift(BDRB_HOME + "/#{x}")}
> $LOAD_PATH.unshift(WORKER_ROOT)
> 
> require "rubygems"
> require "yaml"
> require "erb"
> require "logger"
> require "packet"
> require "optparse"
> 
> require "bdrb_config"
> require RAILS_HOME + "/config/boot"
> require "active_support"
> 
> BackgrounDRb::Config.parse_cmd_options ARGV
> BDRB_CONFIG = BackgrounDRb::Config.read_config("#{RAILS_HOME}/config/backgroundrb.yml")
> 
> require RAILS_HOME + "/config/environment"
> require "bdrb_job_queue"
> require "backgroundrb_server"
> 
> PID_FILE = "#{RAILS_HOME}/tmp/pids/backgroundrb_#{BDRB_CONFIG[:backgroundrb][:port]}.pid"
> SERVER_LOGGER = "#{RAILS_HOME}/log/backgroundrb_debug_#{BDRB_CONFIG[:backgroundrb][:port]}.log"
> 
> def kill_process arg_pid_file
>    pid = nil
>    File.open(arg_pid_file, "r") { |pid_handle| pid = pid_handle.gets.strip.chomp.to_i }
>    pgid =  Process.getpgid(pid)
>    puts "stopping backgroundrb"
>    Process.kill('-TERM', pgid)
>    File.delete(arg_pid_file) if File.exists?(arg_pid_file)
> end
> 
> def status
>    File.exists?(PID_FILE)
> end
> 
> def start
>    if fork
>      sleep(5)
>      exit
>    else
>      if status
>        puts "already running"
>        exit
>      end
> 
>      puts "starting backgroundrb"
> 
>      op = File.open(PID_FILE, "w")
>      op.write(Process.pid().to_s)
>      op.close
>      if BDRB_CONFIG[:backgroundrb][:log].nil? or BDRB_CONFIG[:backgroundrb][:log] != 'foreground'
>        log_file = File.open(SERVER_LOGGER,"w+")
>        [STDIN, STDOUT, STDERR].each {|desc| desc.reopen(log_file)}
>      end
> 
>      BackgrounDRb::MasterProxy.new()
>    end
> end
> 
> def stop
>    pid_files = Dir["#{RAILS_HOME}/tmp/pids/backgroundrb_*.pid"]
>    pid_files.each { |x| kill_process(x) }
> end
> 
> case ARGV[0]
> when 'start'
>    start
> when 'stop'
>    stop
> when 'restart'
>    stop
>    start
> when 'status'
>    if status
>      puts "running"
>      exit
>    else
>      puts "not running"
>      exit!(3)
>    end
> else
>    BackgrounDRb::MasterProxy.new()
> end
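
Because `status` in the script above only tests whether the pid file exists, a stale or overwritten pid file still reports "running". One possible refinement, sketched under the assumption that the pid file contains the daemon's pid, is to probe the process with signal 0. `daemon_running?` is a hypothetical helper, not part of backgroundrb.

```ruby
# Sketch only: daemon_running? is a hypothetical helper, not part of backgroundrb.
def daemon_running?(pid_file)
  return false unless File.exist?(pid_file)

  pid = File.read(pid_file).to_i
  return false unless pid > 0    # empty or garbage pid file

  Process.kill(0, pid)           # signal 0 delivers nothing, it only checks the pid
  true
rescue Errno::ESRCH
  false                          # no such process: the pid file is stale
rescue Errno::EPERM
  true                           # process exists but belongs to another user
end
```

A `start` that calls such a check before writing the pid file would refuse to overwrite the pid of a daemon that is still alive, which is the failure described in the quoted message.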

Ryan Case

2008-Sep-26 23:30 UTC

[Backgroundrb-devel] trouble stopping backgroundrb

Thanks for the patch - this works much better.

Occasionally, I still have to "pkill -9 -f backgroundrb", but most of
the time just the stop script will clean up when one of the  
packet_worker processes dies.

Thanks,
Ryan


On Sep 17, 2008, at 6:08 PM, hemant kumar wrote:
> Okay folks here is a patch to "backgroundrb" script, which should fix
> some issues:
>
> diff --git a/script/backgroundrb b/script/backgroundrb
> index dabf80b..8d4bb78 100755
> --- a/script/backgroundrb
> +++ b/script/backgroundrb
> @@ -49,18 +49,9 @@ when 'stop'
>   def kill_process arg_pid_file
>     pid = nil
>     File.open(arg_pid_file, "r") { |pid_handle| pid = pid_handle.gets.strip.chomp.to_i }
> -    begin
> -      pgid =  Process.getpgid(pid)
> -      Process.kill('TERM', pid)
> -      Process.kill('-TERM', pgid)
> -      Process.kill('KILL', pid)
> -    rescue Errno::ESRCH => e
> -      puts "Deleting pid file"
> -    rescue
> -      puts $!
> -    ensure
> -      File.delete(arg_pid_file) if File.exists?(arg_pid_file)
> -    end
> +    pgid =  Process.getpgid(pid)
> +    Process.kill('-TERM', pgid)
> +    File.delete(arg_pid_file) if File.exists?(arg_pid_file)
>   end
>   pid_files = Dir["#{RAILS_HOME}/tmp/pids/backgroundrb_*.pid"]
>   pid_files.each { |x| kill_process(x) }
>
> What it does is:
> 1. Deleting by group id is enough for master process.
> 2. Do not delete the pid file if, there was an exception while
> stopping the daemon.
> 3. Do not handle exceptions silently.
>
> Please try this and let me know, how it goes.
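
For the cases where TERM is not enough and a manual `pkill -9` is still needed, a stop routine could escalate on its own: send TERM to the group, wait a bounded time for the master process to exit, then send KILL. The sketch below is an assumption about how that might look. `terminate_group`, the timeout and the polling interval are hypothetical and not part of the patch.

```ruby
# Sketch only: terminate_group, timeout and interval are hypothetical names.
def terminate_group(pid, timeout = 5.0, interval = 0.1)
  pgid = Process.getpgid(pid)
  Process.kill('-TERM', pgid)       # polite request to the whole process group

  deadline = Time.now + timeout
  while Time.now < deadline
    begin
      Process.kill(0, pid)          # probe only: raises ESRCH once the master is gone
    rescue Errno::ESRCH
      return :terminated
    end
    sleep interval
  end

  Process.kill('-KILL', pgid)       # last resort: KILL cannot be trapped or ignored
  :killed
rescue Errno::ESRCH
  :gone                             # process or group vanished before a signal was sent
end
```

Only the master pid is polled here, so workers that outlive the master are not detected; checking each worker pid would be a further refinement.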
