I'm having some issues with workers dying after a period of several hours. Each worker runs a loop that asks Amazon SQS for work to do. If there is a message in the queue, the work is completed (image processing, etc.); if there is no message, the worker sleeps for X seconds (sleep 10, etc.). I've noticed that the workers frequently exhibit one of two bad behaviors: A) they stop asking for messages but still exist as a process, or B) they die completely (no more process) with no errors reported in either log file.

I made a simple DeathWorker last night to try to pin down exactly *when* death occurs. The worker logs when it asks for a message, when it goes to sleep, and when it wakes up. Like so:

09/27/2007 13:23:05 (7673) DeathWorker: SQSMiddleMan.next_message (:death_worker)
09/27/2007 13:23:05 (7673) DeathWorker: No message. Going to sleep.
09/27/2007 13:23:34 (7673) DeathWorker: Done sleeping.

The above entries show the normal course of operation for the DeathWorker: look for a message, almost immediately report that there is no message, go to sleep for 10 seconds, then wake up and log that it is awake. As you can see, though, there were roughly 29 seconds between the "Going to sleep" entry and the "Done sleeping" entry, not 10. Is it possible that the log synchronization that occurs through the logging worker causes the delay?

This happened later in the night:

09/27/2007 13:23:38 (7673) DeathWorker: No message. Going to sleep.
09/27/2007 13:27:15 (7673) DeathWorker: Done sleeping.

Almost four minutes of sleep when I call sleep 10. Interesting. Later in the night:

09/27/2007 13:50:13 (7673) DeathWorker: No message. Going to sleep.
09/27/2007 19:29:36 (7673) DeathWorker: Done sleeping.

Wow! Almost 6 hours of sleeping! After that nap the worker ran for another 10 minutes or so and then the process actually died, with no errors reported in the log.

Any idea what is going on? How can I debug this issue? Every time I try to attach to the oversleeping process with GDB, it segfaults.

Thanks in advance!
Erik
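
P.S. For reference, here is roughly what the worker loop looks like. This is a simplified sketch, not the real code: `log` and `process_message` below are stand-ins for our actual logging and image-processing calls, and only SQSMiddleMan.next_message(:death_worker) is taken verbatim from the logs above.

  def log(msg)
    # Stand-in: the real logging goes through a separate logging worker.
    puts "#{Time.now.strftime('%m/%d/%Y %H:%M:%S')} (#{Process.pid}) DeathWorker: #{msg}"
  end

  loop do
    log "SQSMiddleMan.next_message (:death_worker)"
    message = SQSMiddleMan.next_message(:death_worker)  # ask SQS for work

    if message
      process_message(message)   # the real workers do image processing, etc.
    else
      log "No message. Going to sleep."
      sleep 10                   # X seconds; the DeathWorker sleeps 10
      log "Done sleeping."
    end
  end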