Ezra,
I agree w/ you on pretty much all points. The current version is doing
what I need it to do, but all the thread/mutex juggling is making it
much slower, more complex, and flakier than it needs to be (especially
w/ my queue_worker).
Plus, compute-intensive tasks don't (to my knowledge) take advantage of
SMP setups.
I've come to realize that Ruby doesn't really cut it for my compute
tasks, so I'm biting the bullet and migrating that part to pure C. DRb2
sounds like a perfect fit. I'm looking forward to checking out your new
code at that point for the process-based system. I learned most of what
I know about threading starting from your original code, so hey, why not
for process control too? :)
It'll be at least a week before I get to it, though. I'm in the same
boat as you. Have to prioritize getting the $$ rolling in before making
things perfect. Sigh...
Yeah, loading all of Rails is a huge waste of memory. I'm thinking along
the lines of a checklist of Rails pieces for each worker class that can
be individually enabled/disabled. ActiveRecord, if enabled, would be fed
a sublist of models to be loaded.
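Very roughly, I picture the checklist looking something like the sketch
below. To be clear, this is just an illustration: RailsChecklist, uses,
and load_plan are names I made up (nothing like them exists in bdrb
today), and the app/models path is an assumption. It only builds the
list of files a worker would require; it doesn't load anything.

```ruby
# Hypothetical sketch: RailsChecklist, uses, and load_plan are invented
# names for illustration, not part of backgroundrb.
module RailsChecklist
  # Checklist entry => library that entry would require.
  LIBRARIES = {
    active_record: 'active_record',
    action_mailer: 'action_mailer',
    active_support: 'active_support'
  }.freeze

  # Enable one piece for this worker class, optionally with a sublist
  # of models (only meaningful for :active_record).
  def uses(piece, models: [])
    raise ArgumentError, "unknown piece: #{piece}" unless LIBRARIES.key?(piece)
    checklist[piece] = models
  end

  def checklist
    @checklist ||= {}
  end

  # Build the ordered list of things this worker would require.
  # model_dir is an assumed location for model files.
  def load_plan(model_dir: 'app/models')
    checklist.flat_map do |piece, models|
      [LIBRARIES.fetch(piece)] + models.map { |m| File.join(model_dir, "#{m}.rb") }
    end
  end
end

class ReportWorker
  extend RailsChecklist
  uses :active_record, models: %w[user invoice]
end

class PingWorker
  extend RailsChecklist
  # Nothing enabled: this worker would load no Rails pieces at all.
end

p ReportWorker.load_plan
p PingWorker.load_plan
```

A worker that enables nothing gets an empty plan, which is the whole
point: the manager process never pays for pieces nobody asked for.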
Anyways, it'll get there...
Thanks a bunch for all the hard work Ezra,
David Lemstra
Ezra Zygmuntowicz wrote:
> Hey there bdrb'ers!
>
> I have been so very busy with http://engineyard.com that I have not
> had time to complete the new release of backgroundrb. But I have done
> substantial work towards it. It is basically an entirely new beast
> with a complete rethinking of how it works. I have put up a zip file
> of the new code base in the hopes that some folks will be able to
> start playing with it and work with me to finish it up so people can
> use it. So please if you feel so inclined, download this new version
> and have a look over the code.
>
> http://brainspl.at/backgroundrb.zip
>
> Ok here is the run down on new features and what still needs to be
> done to finish this up to be used.
>
> * I have ditched the old start/stop scripts and would like to use the
>   daemons gem to manage the daemon, as it is more robust than a
>   hand-rolled solution. So new start/stop and service scripts will need
>   to be written using the daemons gem.
>
> * The new plugin will depend on the slave gem, so make sure to install
>   the latest: gem install slave
>
> * The new version will be multi-process instead of single-process.
>   The middleman is still threaded, but in a different way. It now has a
>   ThreadPool class that you can set the size limit on. This allows
>   you to control how many processes get spawned at once. This makes
>   bdrb infinitely more scalable.
>
> * The MiddleMan class tried to do way too much in the current
>   version. This violates a bunch of OOP rules, so I decided to simplify
>   it. Now the middleman just manages a thread pool of worker slaves,
>   and that's pretty much all it does. Look in the middleman_slave.rb
>   file for the new middleman class.
>
> * People liked using timed workers. Unfortunately, the current impl of
>   timing is sorely lacking. It works OK for very simple stuff, but
>   people have issues with it running more than one instance of the
>   timed worker at once. This is no good. So I have created a new
>   scheduler class that is solely dedicated to scheduling jobs. There
>   is a Trigger class that allows for simple repeat-every style workers.
>   But there is also a new CronTrigger class that can fire off events
>   using cron-compatible syntax! This means you can use cron jobs like
>   '1 * * * * * *' or even '28/5,59 1-4,6,20 */1 * 5,0/1 * *'. This
>   allows for a very flexible scheduler. I borrowed some of this code
>   from the moment gem and integrated it with bdrb. Have a look at the
>   scheduler.rb file and the scheduler test as well.
>
> * All of the new code is in the bdrb directory in the plugin. If you
>   want to run the new middleman_slave and see it manage multiple
>   processes at once, you can run ruby middleman_slave.rb in one
>   terminal, and then in irb here are a few commands to play with it:
>
> require 'drb'
> DRb.start_service('druby://localhost:0')
> m=DRbObject.new(nil, "druby://localhost:2000")
> 40.times{m.new_worker :class => :worker}
> m.jobs.keys
> m.jobs.keys.each {|o| p m.worker(o).progress}
> m.jobs.keys.each {|o| m[o].shutdown if m.worker(o).done}
>
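For anyone else following along, here's roughly how I picture a
size-limited pool like the ThreadPool you describe. It's only a sketch
under my own assumptions: SimplePool, schedule, shutdown, and demo_peak
are made-up names (this is not the class from the zip), and the jobs
here are plain blocks rather than spawned slave processes.

```ruby
require 'thread'

# Hypothetical SimplePool: a fixed number of threads pull jobs off a
# shared queue, so at most `size` jobs ever run at the same time.
class SimplePool
  def initialize(size)
    @jobs = Queue.new
    @threads = Array.new(size) do
      Thread.new do
        # Each thread keeps taking jobs until it pops the :stop marker.
        while (job = @jobs.pop) != :stop
          job.call
        end
      end
    end
  end

  # Queue a block to be run by the next free thread.
  def schedule(&block)
    @jobs << block
  end

  # Let queued jobs finish, then stop and join every thread.
  def shutdown
    @threads.size.times { @jobs << :stop }
    @threads.each(&:join)
  end
end

# Runs job_count short jobs and reports the highest number that were
# in flight at once, plus how many completed.
def demo_peak(size, job_count)
  pool = SimplePool.new(size)
  lock = Mutex.new
  running = peak = done = 0
  job_count.times do
    pool.schedule do
      lock.synchronize do
        running += 1
        peak = [peak, running].max
      end
      sleep 0.01 # stand-in for real work
      lock.synchronize do
        running -= 1
        done += 1
      end
    end
  end
  pool.shutdown
  [peak, done]
end

peak, done = demo_peak(3, 10)
puts "peak concurrency: #{peak}, jobs finished: #{done}"
```

The size passed to the pool is the only knob: extra jobs just wait in
the queue until a thread frees up, which is the behavior I'd want
before each job turns into a whole process.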
>
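On the cron-style strings: here's a small sketch of how one field such
as '28/5,59' could expand into concrete values. This is my guess at the
semantics, not the CronTrigger parser itself. expand_cron_field is a
made-up helper, and I'm assuming 'N/S' means "start at N, then every S
up to the field's maximum", which may not match what the moment-derived
code does.

```ruby
# Hypothetical helper, not taken from scheduler.rb.
# Expands one cron field into a sorted list of integers.
# Supported items (comma separated): *, N, A-B, each with optional /STEP.
# Assumption: "N/STEP" starts at N and steps up to max.
def expand_cron_field(field, min, max)
  values = field.split(',').flat_map do |item|
    range, step = item.split('/')
    step = step ? Integer(step, 10) : 1
    raise ArgumentError, "step must be positive: #{item}" if step < 1

    first, last =
      if range == '*'
        [min, max]
      elsif range.include?('-')
        range.split('-').map { |n| Integer(n, 10) }
      elsif item.include?('/')
        [Integer(range, 10), max]
      else
        n = Integer(range, 10)
        [n, n]
      end

    first.step(last, step).to_a
  end
  values.uniq.sort
end

p expand_cron_field('28/5,59', 0, 59)
p expand_cron_field('1-4,6,20', 0, 23)
p expand_cron_field('*/15', 0, 59)
```

A trigger would then fire when the current value of each time component
is in that field's expanded list.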
> Now that we use the slave gem to manage external workers, there is a
> slight interface change to the middleman. It works the exact same way
> to start workers with new_worker. But when you want to get a handle
> on your worker to call methods, you need to use
> MiddleMan.worker(key).method. get_worker or MiddleMan[key] gets you a
> handle on the slave process, so you can call shutdown on it, get its
> PID, and all that. But worker(key) gets you the handle on your actual
> worker object.
>
>
> So that's my notes for you for right now. Please feel free to write
> in with any questions. The main thing that would help get it off
> the ground would be the daemons start/stop scripts. And I need to
> make a new config file format and parser, as well as a nice interface
> where you define timers.
>
> **IMPORTANT NOTE**
>
> Now that we are using multi-process workers, it is very important
> that you do not require all of Rails by default. Of course some
> workers will need this, but it would be best to spawn one of these up
> and reuse it somehow.
>
> So we need to keep the drb server clean and slim, as just a manager.
> Each worker class will need to require what it needs for db
> connections and other stuff. This needs some more thought given to it.
> But the current single-process bdrb is a mess: it opens way too many
> db handles, and threading is not enough to handle a ton of workers.
> This new way of doing things will require a bit more work when
> building your worker classes, but the payoff will be huge and it will
> allow you to scale to real high-traffic sites now. So anyone with
> ideas about this, please pipe up. I think I will make a script you can
> require in your workers that will set up ActiveRecord for you. But I
> think that we should not be requiring all of Rails; it is so
> inefficient.
>
>
> I know this last point may make it harder for some folks to make
> workers easily because all of Rails won't be available. But this was
> necessary in order to make bdrb truly scalable. Running an entire
> Rails process just to run a worker class is a huge waste. So this
> will need to be worked out as we start to use the new plugin.
>
> I want to thank everyone for their patience. I have been so busy I
> haven't had time to finish this up. Any help from you guys would be
> very much appreciated.
>
>
> Thoughts?
>
> -Ezra