thr3ads.net - Backgroundrb devel - [Backgroundrb-devel] Job Queue [Feb 2008]


Norman Elton

2008-Feb-01 18:58 UTC

[Backgroundrb-devel] Job Queue

After some initial stumblings, I think I've got the hang of
backgroundrb. It's great! I'd been thinking for many many months how
cool something like this would be!

I'm trying to make a "job queue". That is, a pool of worker threads
monitors a queue. When a job appears, one of the workers grabs it and
executes the task. When complete, the worker returns to watching the
queue. If all workers are busy, jobs simply sit in the queue until
resources are available.

I can see how to build an array containing jobs. And a mutex could be  
used to ensure thread-safe access to the array. But how can the  
workers "go to sleep" until a job arrives?

Any thoughts?

Thanks again!

Norman

Ryan Leavengood

2008-Feb-01 19:19 UTC

On Feb 1, 2008 1:58 PM, Norman Elton <normelton at gmail.com> wrote:
>
> I can see how to build an array containing jobs. And a mutex could be
> used to ensure thread-safe access to the array. But how can the
> workers "go to sleep" until a job arrives?
>
> Any thoughts?

I had a similar requirement and ended up just using a single worker
and thread_pool.defer (the documentation talks about how to use this).
I've looked at the BackgrounDRb code, and thread_pool.defer makes use
of a queue that a configurable number of threads read from. You can
set the size of the pool by calling pool_size in your worker. The
default pool size is 20 threads.
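
To illustrate the idea of a queue read by a fixed number of threads, here is a minimal sketch in plain Ruby. It is not BackgrounDRb's implementation and does not use thread_pool.defer; it assumes Ruby 2.3 or later (for Queue#close), and run_job_queue is a hypothetical name chosen for the example. The point it shows is that a blocking pop on Ruby's standard Queue is what lets idle workers "go to sleep" until a job arrives.

```ruby
# Hypothetical helper (not part of BackgrounDRb): runs each job, given as a
# callable, on a fixed pool of worker threads and returns the results.
def run_job_queue(jobs, pool_size: 3)
  queue   = Queue.new   # thread-safe FIFO; pop blocks while it is empty
  results = Queue.new   # also thread-safe; collects finished work

  workers = Array.new(pool_size) do
    Thread.new do
      # pop puts this thread to sleep until a job arrives. Once the queue
      # is closed and drained, pop returns nil and the loop ends.
      while (job = queue.pop)
        results << job.call
      end
    end
  end

  jobs.each { |job| queue << job }
  queue.close            # wakes idle workers so they can exit
  workers.each(&:join)

  # Results arrive in completion order, not submission order.
  Array.new(results.size) { results.pop }
end

squares = run_job_queue((1..5).map { |n| -> { n * n } })
puts squares.sort.inspect
```

If all workers are busy, extra jobs simply wait in the queue, which is the behaviour asked about in the original question.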

If you need to know the status of each thread you need to use a mutex
to synchronize access to a hashtable member variable which can contain
the status for each thread (you would need to decide what the key
would be for each thread, maybe the database ID of the job the thread
is processing.) You would then pass this hashtable to the
register_status method. There is a post from hemant on this mailing
list from December which shows how to do this pretty well:

http://rubyforge.org/pipermail/backgroundrb-devel/2007-December/001170.html
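
As a rough illustration of the mutex-guarded status hash described above, here is a sketch in plain Ruby. JobStatusBoard and its methods are hypothetical names invented for this example, and the job IDs are made up. The call to register_status is deliberately left out; per the description above, the hash returned by snapshot is the kind of value you would hand to it.

```ruby
# Hypothetical class: a status hash keyed by job ID, guarded by a mutex.
class JobStatusBoard
  def initialize
    @lock = Mutex.new
    @statuses = {}
  end

  # Record the latest status for one job.
  def update(job_id, status)
    @lock.synchronize { @statuses[job_id] = status }
  end

  # Return a copy so callers never read the hash while it is being changed.
  def snapshot
    @lock.synchronize { @statuses.dup }
  end
end

board = JobStatusBoard.new
threads = [101, 102, 103].map do |job_id|   # made-up job IDs
  Thread.new do
    board.update(job_id, :running)
    # ... the real job processing would happen here ...
    board.update(job_id, :done)
  end
end
threads.each(&:join)
puts board.snapshot.inspect
```

Keying by the database ID of the job, as suggested above, means the status survives no matter which pool thread picks the job up.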

The only caveat here is the green Ruby threads used in the thread_pool
may not play well with the job processing you are doing. But from the
testing I've done it seems pretty good, especially for something
involving the network which probably has a lot of latency.

The general architecture I would suggest would be to have a jobs table
in the database, and when jobs are added the Rails model
(after_create) can call MiddleMan.ask_work and pass the ID of the job
just created. The worker will pass that job_id into thread_pool.defer
which will then process the job. For my own work I tend to put all the
heavy processing into separate classes which I simply call from the
worker. So for you maybe something like JobProcessor.run(job_id) or
whatever.
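
The split between a thin worker and a separate processing class might look roughly like this in plain Ruby. This is a stand-in sketch, not the BackgrounDRb API: JobWorker, process_job and wait are hypothetical names, and a bare Thread takes the place of thread_pool.defer. In a real setup the model's after_create hook would call MiddleMan.ask_work with the job ID, as described above; those calls are not shown because their exact form depends on the BackgrounDRb version.

```ruby
# Hypothetical class holding the heavy work, kept separate from the worker.
class JobProcessor
  def self.run(job_id)
    # Placeholder: the real version would load the job row and process it.
    "processed job #{job_id}"
  end
end

# Hypothetical stand-in for a worker; NOT the BackgrounDRb worker API.
class JobWorker
  def initialize
    @threads = []
  end

  # Receives only the job ID and hands the work to a background thread
  # (the role thread_pool.defer plays in the description above).
  def process_job(job_id)
    @threads << Thread.new { JobProcessor.run(job_id) }
  end

  # Wait for every handed-off job and return the results in submit order.
  def wait
    @threads.map(&:value)
  end
end

worker = JobWorker.new
[1, 2].each { |job_id| worker.process_job(job_id) }
puts worker.wait.inspect   # prints ["processed job 1", "processed job 2"]
```

Passing only the ID keeps the message to the worker small and lets the processing class be tested without any background machinery.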

Regards,
Ryan

Jason LaPier

2008-Feb-01 19:37 UTC

I was just about to reply, but you linked to my code sample already :)

The bugs we were discussing in that thread have been resolved and I've
been using thread pool in combination with mutex to save a hash of
"statuses" as I described in that post for a little over a month now
in production with no problems. Admittedly, the site in question is
not exactly high traffic (75k pageviews for the month of January
according to analytics), but I've had no stability issues whatsoever
in the last month. I run a pool of 10 threads currently, but the job
is fairly short so I doubt I'm ever using more than a couple threads
at once.

- Jason L.

-- 
My Rails and Linux Blog: http://offtheline.net



hemant

2008-Feb-01 19:51 UTC

On Sat, Feb 2, 2008 at 1:07 AM, Jason LaPier <jason.lapier at gmail.com> wrote:
> I was just about to reply, but you linked to my code sample already :)
>
> The bugs we were discussing in that thread have been resolved and I've
> been using thread pool in combination with mutex to save a hash of
> "statuses" as I described in that post for a little over a month now
> in production with no problems. Admittedly, the site in question is
> not exactly high traffic (75k pageviews for the month of January
> according to analytics), but I've had no stability issues whatsoever
> in the last month. I run a pool of 10 threads currently, but the job
> is fairly short so I doubt I'm ever using more than a couple threads
> at once.
>
I have been toying with some code that I picked up from other Ruby
projects for implementing a job queue based on database tables.
But I absolutely don't want to add any features without test cases in
hand.

But yeah, as Ryan and Jason said, you should have no trouble
using the thread_pool feature.
