hemant
2008-Jun-18 20:44 UTC
[Backgroundrb-devel] Please comment on upcoming changes of backgroundrb
Folks,

I am getting ready for a new release of BackgrounDRb. It will mostly be a tagging of the git release which is already being used in production. Not to mention, Packet and BackgrounDRb have seen a _lot_ of improvements and fixes since the last release.

There are also a few changes that I want to introduce; please comment on them:

1. As posted earlier - a way of running threads inside backgroundrb while still being able to use result saving and the like easily. Currently the method is named fetch_parallely, which I am planning to rename to run_concurrently. #defer stays as it is.

http://gnufied.org/2008/06/12/unthreaded-threads-of-hobbiton/

2. The ability to cluster and connect to multiple backgroundrb servers. It involves some additions to backgroundrb.yml like:

# the following section is totally optional, and only useful if you are trying to run a
# cluster of backgroundrb servers. If you do not specify this section, backgroundrb will
# assume that from rails you are connecting to the backgroundrb server specified in the
# previous section
:client:
  :drb_servers: "10.0.0.1:11006,10.0.0.2:11007"

So, when you say:

MiddleMan.worker(:hello_worker).fooo(@user)

BackgrounDRb will delegate the task to your backgroundrb servers in a round robin manner. By using:

MiddleMan.worker(:hello_worker)

you can get one specific instance of a worker which is tied to one particular server. Also, #new_worker will work in a round robin manner, but you must call #delete on the returned object.

3. With clustering comes the question of storing worker results. So far, backgroundrb result storage has been a bit of a problem. In the new version, I am planning to rename ask_status to ask_result. Also, register_status will be deprecated, and for storing results you will have to use:

cache[some_key] = result

where cache is the local cache in your worker. Note that here you are defining a key for your result. In-memory, in-process storage of results won't work if backgroundrb servers are clustered, hence you will have to use memcache-based storage if you are going to cluster your workers. The mechanism will be the same: you specify a key, which will be combined with worker_name and worker_key. (A rough sketch of how this is meant to be used follows at the end of this mail.)

Also, job_key, wherever used, will be replaced with worker_key, since I find that name confusing.

That's all folks. Please try the git version. Stress test it and let me know about any problems.

http://gnufied.org/2008/05/21/bleeding-edge-version-of-backgroundrb-for-better-memory-usage/

--
Let them talk of their oriental summer climes of everlasting
conservatories; give me the privilege of making my own summer with my
own coals.

http://gnufied.org
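PS: to make the result storage change in (3) a bit more concrete, here is a rough, untested sketch of how I imagine it being used. The worker, method, and key names below are made up for illustration, and the exact call signatures may still change:

class ReportWorker < BackgrounDRb::MetaWorker
  set_worker_name :report_worker

  def build_report(user_id)
    data = "report for user #{user_id}"  # stand-in for real work
    # store the result under a key of your choosing; internally the key
    # gets combined with worker_name and worker_key
    cache[:latest_report] = data
  end
end

# from rails, after the worker method has run, fetch the stored result:
MiddleMan.worker(:report_worker).ask_result(:latest_report)

With clustering, the same calls would simply go against the memcache-backed storage instead of the in-process cache.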
Stevie Clifton
2008-Jun-20 15:22 UTC
[Backgroundrb-devel] Please comment on upcoming changes of backgroundrb
Hey Hemant,

This is great. Thanks for putting in the effort to get the git version tagged so people feel comfortable using it in production! I'm assuming that when you say you'll tag the git release, you'll also push out new bdrb and packet gems, correct?

I had a quick question about fetch_parallely/run_concurrently. Is it basically doing the same thing as defer(), but just with a thread-safe callback? If so, it might be more intuitive to make the name of the method reflect that, such as "defer_concurrently", so as not to make people think that they do completely different things.

Also, if the above is true (run_concurrently is a thread-safe version of defer), I think it would be less cumbersome if you just extended defer() to allow the same functionality by passing the name of a callback method in an options hash instead of creating an entirely new method. This way people could just use the block passed to defer() as the request_proc, and another method at the worker level for the response_proc, instead of creating a bunch of procs to pass to the method. For example, instead of:

run_concurrently(args, request_proc, response_proc)

you could do:

def example_worker(args)
  defer(args, :callback => :my_callback_method) do |args|
    # same as the body of request_proc
  end
end

def my_callback_method(result)
  # same as body of response_proc (call register_status or whatever)
end

I looked through the code in git for fetch_parallely, and I'm not sure if this would give you the same flexibility, but it would be a more intuitive solution to a user IMHO.

stevie
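p.s. Just to sketch the dispatch I'm imagining (this is not the actual defer() implementation, and the threading is hand-waved; it's only meant to show how an options-hash callback could be routed back to a worker method):

# rough sketch only: run the block off the calling thread and, when it
# finishes, hand its return value to the named worker method
def defer(args, options = {})
  callback = options[:callback]
  Thread.new do
    result = yield(args)                 # body written by the worker author
    send(callback, result) if callback   # e.g. my_callback_method(result)
  end
end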
hemant
2008-Jun-20 17:26 UTC
[Backgroundrb-devel] Please comment on upcoming changes of backgroundrb
Well, thanks for looking it over. Actually, I have made quite a few changes and pushed an update to git on the testcase branch:

http://github.com/gnufied/backgroundrb/commits/testcase

I will take your advice on the run_concurrently method. But I have made result storage completely thread safe, and hence users can call it from threads without any worries.

Actually, I made quite a few API changes today. Here is a sample of the API, if you try the new branch: http://pastie.org/218967

Or:

# Run a task like:
# this is our dear remote process
class HelloWorker
  set_worker_name :hello_worker

  def barbar(t_user)
    # runs method some_task inside the thread pool
    thread_pool.defer(t_user, method(:some_task))
  end

  def some_task(user)
    user_feeds = user.feeds
    loop do
      # user can retrieve/add/edit objects from the result cache without worrying about
      # thread safety. When you call job_key in your threads it automatically resolves to
      # the job_key of the task being executed; there may be another task being executed
      # in another thread, but since job_key is a thread-local variable it will always
      # resolve to the correct one.
      old_counter = cache[job_key]
      cache[job_key] += 10
    end
  end
end

# invoke tasks/methods in the worker asynchronously
MiddleMan.worker(:hello_worker).async_barbar(<some_job_key>, @user)

# invoke a method in the worker synchronously
MiddleMan.worker(:hello_worker).barbar(<some_job_key>, @user)

# ask for the result object stored with job_key "wow"
MiddleMan.worker(:hello_worker).ask_result(<job_key_or_key_with_which_you_stored_result>)

# If you are doing this in production, I strongly advise using memcache for result storage.
# There is an issue if a user invokes multiple tasks in the thread pool directly from one of
# the workers: under the current settings they are going to end up with the same job key.
# Also, new_worker can't follow the same method invocation conventions because it accepts
# more parameters.

Again, MiddleMan now basically wraps a cluster of bdrb servers, and when you say:

MiddleMan.worker(:hello_worker).async_barbar(<some_job_key>, @user)

it will be invoked in a round robin manner across all the bdrb servers specified in the configuration file (see the sketch at the end of this mail for the general idea). The complete configuration file looks like:

# A sample YAML configuration file
---
:backgroundrb:
  :ip: 0.0.0.0 # ip on which the backgroundrb server is running
  :port: 11006 # port on which the backgroundrb server is running
  :environment: production # rails environment loaded, defaults to development
  :debug_log: true # whether to print debug logs to a separate worker, defaults to true
  :log: foreground # will print log messages to STDOUT, defaults to a separate log worker
  :result_storage: memcache # store results in a memcache cluster; you also need to specify the location of your memcache clusters in the next option
  :memcache: "10.0.0.1:11211,10.0.0.2:11211" #=> location of memcache clusters separated by commas

# the following section is totally optional, and only useful if you are trying to run a
# cluster of backgroundrb servers. If you do not specify this section, backgroundrb will
# assume that from rails you are connecting to the backgroundrb server specified in the
# previous section
:client:
  :drb_servers: "10.0.0.1:11006,10.0.0.2:11007"

# You specify your worker schedules here
:schedules:
  :foo_worker: # worker name
    :barbar: # worker method
      :trigger_args: */5 * * * * * * # worker schedule

Please comment on this.
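For the curious, the round robin selection on the client side is conceptually just this (a sketch only, not the actual packet/backgroundrb code; class and method names are made up):

# cycle through the configured :drb_servers entries, one per invocation
class RoundRobinServers
  def initialize(drb_servers)
    # "10.0.0.1:11006,10.0.0.2:11007" => [["10.0.0.1", 11006], ["10.0.0.2", 11007]]
    @servers = drb_servers.split(",").map do |pair|
      host, port = pair.split(":")
      [host, port.to_i]
    end
    @index = 0
  end

  # next server in round robin order
  def next_server
    server = @servers[@index % @servers.size]
    @index += 1
    server
  end
end

servers = RoundRobinServers.new("10.0.0.1:11006,10.0.0.2:11007")
servers.next_server # => ["10.0.0.1", 11006]
servers.next_server # => ["10.0.0.2", 11007]
servers.next_server # => ["10.0.0.1", 11006]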