thr3ads.net - Backgroundrb devel - [Backgroundrb-devel] Registering status for multithreaded worker? [Feb 2008]


Greg Campbell

2008-Feb-21 01:03 UTC

[Backgroundrb-devel] Registering status for multithreaded worker?

Hi all,

I'm using a backgroundrb worker for processing data reporting tasks which
can be initiated by users of my rails application, and which need to support
status monitoring.  I had been spawning a new instance with a new job_id for
each task, and reporting/requesting status via that job_id.  It appears that
this sort of thing may be better handled by thread_pool, but there seem to
be two ways of dealing with status reporting, and I'm curious whether people
have found one to be preferable over the other:

I could track status in the database, as I'm creating a new row for each
task anyway to store the results, or I could use register_status, with a
hash keyed on the equivalent of job_id (inside a mutex, as suggested in the
README).  Is there any reason to prefer the second over the first?
Alternately, am I incorrect in assuming that thread_pool is preferable to
spawning one worker per user request?

Thanks,
Greg Campbell
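
The second option in the message above (a hash keyed on the task id, guarded by a mutex) can be sketched in plain Ruby. This is only an illustration under stated assumptions: the StatusRegistry class and its update and snapshot methods are hypothetical names, not part of BackgrounDRb, and a real worker would still hand the hash to register_status as the README describes.

```ruby
require "thread"

# Hypothetical helper for illustration; not part of BackgrounDRb.
class StatusRegistry
  def initialize
    @statuses = {}
    @lock = Mutex.new
  end

  # Record the latest status for one task.
  def update(task_id, status)
    @lock.synchronize { @statuses[task_id] = status }
  end

  # Return a copy taken under the lock, so readers never see a hash
  # that another thread is in the middle of changing.
  def snapshot
    @lock.synchronize { @statuses.dup }
  end
end

registry = StatusRegistry.new

# Four threads stand in for deferred tasks reporting progress.
threads = (1..4).map do |task_id|
  Thread.new do
    registry.update(task_id, :running)
    registry.update(task_id, :done)
  end
end
threads.each { |t| t.join }

puts registry.snapshot.inspect
```

Reads and writes both go through the same lock, and readers only ever receive a copy of the hash.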

hemant

2008-Feb-21 11:54 UTC


[Backgroundrb-devel] Registering status for multithreaded worker?

Hi Greg,

thread_pool is definitely preferable over one worker per request approach.

Also, register_status is usually faster than a hand-rolled approach
using the database. Worker status results can be stored in memcache
clusters as well, which is another reason to prefer it.

Greg Campbell

2008-Feb-26 20:27 UTC


[Backgroundrb-devel] Registering status for multithreaded worker?

One followup here:

Things seem to be working with thread_pool, except for one issue:
ask_status returns something incorrect the first time it's called after an
ask_work call.  It looks like the first ask_status response is the return
value from the worker method that's calling thread_pool.defer, when I would
think that return value should be irrelevant (as the worker's being invoked
with the non-blocking ask_work).  Has anyone seen this behavior before?

For reference, my code basically works this way (with all app-specific
stuff removed):

(controller)
def initiate_task
  @task = Task.create
  MiddleMan.ask_work(:worker => :threaded_worker,
                     :worker_method => :process_task,
                     :data => @task.id)
end

# Called via AJAX polling to update the progress bar
def task_status
  @task_id = params[:task_id].to_i
  status_hash = MiddleMan.ask_status(:worker => :threaded_worker)
  # do something with status_hash[@task_id]...
end

(worker)
def create
  @worker_status = {}
  # Must use the same name that update_status synchronizes on
  @status_mutex = Mutex.new
  register_status(@worker_status)
end

def process_task(task_id)
  thread_pool.defer(task_id) do |task_id|
    # do several things which call update_status...
  end
  return {:this_should_be => :irrelevant}
end

def update_status(task_id, status)
  @status_mutex.synchronize do
    @worker_status[task_id] = status
  end
  register_status(@worker_status)
end


Based on my logging in the controller, the first time task_status is called,
the status_hash retrieved looks something like this: {:type => :response,
:data => {:this_should_be => :irrelevant}, :client_signature => 11}.  The
next time, however, it looks correct: {(task_id_1) => (task_status_1),
(task_id_2) => (task_status_2)}, etc.  Again, has anyone seen this sort of
thing before?  Am I using the API incorrectly?

Thanks,
Greg
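
Until a fixed build is in place, the polling action could guard against the stray first response described above. The following is a minimal sketch, assuming the unwanted value always has the shape shown in the log ({:type => :response, ..., :client_signature => ...}). The helper names response_envelope? and status_for are hypothetical and not part of BackgrounDRb.

```ruby
# Hypothetical helpers for illustration; not part of BackgrounDRb.

# True when the value looks like the response envelope from the log
# rather than the registered status hash.
def response_envelope?(value)
  value.is_a?(Hash) &&
    value[:type] == :response &&
    value.key?(:client_signature)
end

# Returns the status for task_id, or nil when the value from ask_status
# is missing, is not a hash, or is a response envelope.
def status_for(raw_status, task_id)
  return nil unless raw_status.is_a?(Hash)
  return nil if response_envelope?(raw_status)
  raw_status[task_id]
end

envelope = {:type => :response,
            :data => {:this_should_be => :irrelevant},
            :client_signature => 11}

p status_for(envelope, 7)           # envelope is ignored
p status_for({7 => :running}, 7)    # real status hash is used
```

A nil result can be treated by the AJAX poller as "no status yet", so the progress bar simply waits for the next poll.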

hemant

2008-Feb-26 21:41 UTC


[Backgroundrb-devel] Registering status for multithreaded worker?

Hi Greg,

Thanks for the bug report. I was able to reproduce this and have fixed
it in trunk (which has been up on git for a while now).

Get the code using:

   git clone git://gitorious.org/backgroundrb/mainline.git backgroundrb

and follow the README as usual. You will need to install "packet" and
"chronic" gems.
