thr3ads.net - Backgroundrb devel - [Backgroundrb-devel] magical disappearing background processes! [Oct 2008]

Jack Nutting

2008-Oct-10 12:12 UTC

[Backgroundrb-devel] magical disappearing background processes!

Hi all,

I've been having trouble for a long time with backgroundrb processes
that suddenly vanish without a trace.  What happens is that at some
point I discover that all the backgroundrb processes are suddenly
gone.  Nothing special is seen in any of the log files.  This has
happened intermittently for a long time, and I was hoping that
upgrading to 1.0.4 would somehow help me out, but I seem to encounter
the same problem.

It happens infrequently: sometimes two or three times a week, sometimes
not at all for several weeks.  Yesterday it actually happened twice in
ten minutes during a period when the server was heavily loaded, but
that's unusual.  Usually when it happens the server is not under a
heavy load.

Yesterday when it happened, I had the fortune of having a "top" log
running in a terminal window, so I'm able to present some more data.
top was displaying all threads, so most of the processes show up twice
or more.
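
To read a per-thread listing like the ones below, it can help to collapse the rows back into processes. Here is a minimal Python sketch, assuming that rows sharing the same COMMAND string are threads of one process; the `group_threads` helper is hypothetical, and the sample rows are a hand-copied subset of the 15:11:13 snapshot.

```python
from collections import defaultdict


def group_threads(rows):
    """Group per-thread top rows by their COMMAND string.

    Each row is a (pid, command) pair. Rows that share a command string
    are treated as threads of the same process (an assumption, not
    something top itself reports in this view).
    """
    groups = defaultdict(list)
    for pid, command in rows:
        groups[command].append(pid)
    return dict(groups)


# Subset of the 15:11:13 snapshot: (PID column, start of COMMAND column).
rows = [
    (14102, "script/backgroundrb start"),
    (14144, "script/backgroundrb start"),
    (14141, "8:7:log_worker"),
    (14147, "11:10:master"),
    (17446, "11:10:master"),
]

for command, pids in group_threads(rows).items():
    print(f"{command}: {len(pids)} thread(s), PIDs {pids}")
```

Grouping on the full command prefix (for example "16:14:mblox_sender" versus "18:16:mblox_sender") keeps two instances of the same worker class apart.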

I have 5 background workers running, each apparently has 2 threads,
plus log_worker with 1 thread and script/backgroundrb with 2 threads.
My architecture is set up so that only "master" is started
automatically when backgroundrb starts up, and it in turn starts the
rest.

I'm pasting in data for all the backgroundrb processes, sorry for the
terrible formatting but I can't really think of a better way to
present this all.

Here's what it normally looks like while everything is up and running.
This is the last "normal" state I found before it started going
haywire:

top - 15:11:13 up 5 days,  5:05,  3 users,  load average: 3.10, 3.09, 3.02
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
17508 deploy    15   0 49300  35m 2688 S 11.8  1.7   7:54.65 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 18:16:mblox_sender:94:/home/deploy/mbargo/lib/workers:/home/deploy/mbar
17504 deploy    15   0 49648  35m 2688 S  8.2  1.7   8:01.64 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 16:14:mblox_sender:94:/home/deploy/mbargo/lib/workers:/home/deploy/mbar
14141 deploy    15   0 20796  17m 1612 S  0.3  0.8   2:48.59 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 8:7:log_worker:17:/home/deploy/mbargo/lib/workers:/home/deploy/mbargo/s
14147 deploy    15   0 48232  34m 2556 S  0.3  1.7   5:10.90 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 11:10:master:4:/home/deploy/mbargo/lib/workers:/home/deploy/mbargo/scri
17523 deploy    17   0  132m 115m 3316 R  0.3  5.6   6:43.89 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 20:18:campaign_starter:39:/home/deploy/mbargo/lib/workers:/home/deploy/
14102 deploy    17   0 48320  31m 1364 R  0.0  1.5   3:08.97 ruby /home/deploy/mbargo/script/backgroundrb start
14144 deploy    15   0 48320  31m 1364 S  0.0  1.5   0:45.35 ruby /home/deploy/mbargo/script/backgroundrb start
17446 deploy    15   0 48232  34m 2556 S  0.0  1.7   0:43.62 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 11:10:master:4:/home/deploy/mbargo/lib/workers:/home/deploy/mbargo/scri
17486 deploy    15   0 59500  41m 3500 S  0.0  2.0  11:45.15 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 14:13:receiver:39:/home/deploy/mbargo/lib/workers:/home/deploy/mbargo/s
22300 deploy    15   0 59500  41m 3500 S  0.0  2.0   0:45.27 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 14:13:receiver:39:/home/deploy/mbargo/lib/workers:/home/deploy/mbargo/s
23636 deploy    15   0 49648  35m 2688 S  0.0  1.7   0:45.68 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 16:14:mblox_sender:94:/home/deploy/mbargo/lib/workers:/home/deploy/mbar
24042 deploy    15   0 49300  35m 2688 S  0.0  1.7   0:43.58 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 18:16:mblox_sender:94:/home/deploy/mbargo/lib/workers:/home/deploy/mbar
24053 deploy    15   0  132m 115m 3316 S  0.0  5.6   0:43.70 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 20:18:campaign_starter:39:/home/deploy/mbargo/lib/workers:/home/deploy/

Next snapshot, 3 seconds later.  script/backgroundrb is gone, and each
of my workers (except for master) is down to 1 thread.

top - 15:11:16 up 5 days,  5:05,  3 users,  load average: 3.10, 3.09, 3.02
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
17504 deploy    15   0 49648  35m 2688 S 12.6  1.7   8:02.02 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 16:14:mblox_sender:94:/home/deploy/mbargo/lib/workers:/home/deploy/mbar
17486 deploy    17   0 59500  41m 3500 R  0.3  2.0  11:45.16 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 14:13:receiver:39:/home/deploy/mbargo/lib/workers:/home/deploy/mbargo/s
14141 deploy    15   0 20796  17m 1612 S  0.0  0.8   2:48.59 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 8:7:log_worker:17:/home/deploy/mbargo/lib/workers:/home/deploy/mbargo/s
14147 deploy    15   0 48232  34m 2556 S  0.0  1.7   5:10.90 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 11:10:master:4:/home/deploy/mbargo/lib/workers:/home/deploy/mbargo/scri
17446 deploy    15   0 48232  34m 2556 S  0.0  1.7   0:43.62 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 11:10:master:4:/home/deploy/mbargo/lib/workers:/home/deploy/mbargo/scri
22300 deploy    15   0 59500  41m 3500 S  0.0  2.0   0:45.27 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 14:13:receiver:39:/home/deploy/mbargo/lib/workers:/home/deploy/mbargo/s
23636 deploy    15   0 49648  35m 2688 S  0.0  1.7   0:45.68 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 16:14:mblox_sender:94:/home/deploy/mbargo/lib/workers:/home/deploy/mbar

Next, 3 seconds after that,  all I have left is master (still 2
threads) and log_worker:

top - 15:11:19 up 5 days,  5:05,  3 users,  load average: 2.85, 3.03, 3.01
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
14141 deploy    15   0 20796  17m 1612 S  0.0  0.8   2:48.59 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 8:7:log_worker:17:/home/deploy/mbargo/lib/workers:/home/deploy/mbargo/s
14147 deploy    15   0 48232  34m 2556 S  0.0  1.7   5:10.90 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 11:10:master:4:/home/deploy/mbargo/lib/workers:/home/deploy/mbargo/scri
17446 deploy    15   0 48232  34m 2556 S  0.0  1.7   0:43.62 /usr/bin/ruby1.8 /usr/bin/packet_worker_runner 11:10:master:4:/home/deploy/mbargo/lib/workers:/home/deploy/mbargo/scri

At the next snapshot, all backgroundrb processes are gone.
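
One way to see the order in which things disappear is to diff consecutive snapshots by PID. The following is a rough Python sketch, not part of backgroundrb; the `vanished` helper is hypothetical, and the dictionaries are hand-copied from the PID and COMMAND columns of the three snapshots above.

```python
def vanished(before, after_pids):
    """Return the {pid: command} entries of `before` whose PID is not in `after_pids`."""
    return {pid: cmd for pid, cmd in before.items() if pid not in after_pids}


# 15:11:13 snapshot: PID -> start of the COMMAND column.
snap_151113 = {
    17508: "18:16:mblox_sender", 24042: "18:16:mblox_sender",
    17504: "16:14:mblox_sender", 23636: "16:14:mblox_sender",
    14141: "8:7:log_worker",
    14147: "11:10:master", 17446: "11:10:master",
    17523: "20:18:campaign_starter", 24053: "20:18:campaign_starter",
    14102: "script/backgroundrb start", 14144: "script/backgroundrb start",
    17486: "14:13:receiver", 22300: "14:13:receiver",
}
# PIDs still listed at 15:11:16 and 15:11:19.
pids_151116 = {17504, 17486, 14141, 14147, 17446, 22300, 23636}
pids_151119 = {14141, 14147, 17446}

first_gone = vanished(snap_151113, pids_151116)
still_up = {pid: snap_151113[pid] for pid in pids_151116}
second_gone = vanished(still_up, pids_151119)

for label, gone in (("15:11:13 -> 15:11:16", first_gone),
                    ("15:11:16 -> 15:11:19", second_gone)):
    print(label)
    for pid in sorted(gone):
        print(f"  {pid}  {gone[pid]}")
```

Running the same comparison against a log written by a periodic `top` or `ps` job would give a timeline for each future occurrence.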

This is running on Ubuntu 7.10, backgroundrb 1.0.4.  I'm nowhere near
maxing out system memory, and there are no memory or other limits set
on user processes as far as I can tell.  If anyone has any ideas about
what might cause this, or how to dig deeper, please let me know!  I'm
nearly at my wits' end.

-- 
// jack
// http://www.nuthole.com

hemant kumar

2008-Oct-10 12:34 UTC

[Backgroundrb-devel] magical disappearing background processes!

Are you running two copies of BackgrounDRb server on the same machine?
I see two server instances in your top output.



Jack Nutting

2008-Oct-10 12:52 UTC

[Backgroundrb-devel] magical disappearing background processes!

On Fri, Oct 10, 2008 at 2:34 PM, hemant kumar <gethemant at gmail.com> wrote:
> Are you running two copies of BackgrounDRb server on the same machine?
> I see two server instances in your top output.

No, it's just one.  The mode I was running "top" in showed one line
for each thread.

-- 
// jack
// http://www.nuthole.com
