Robert Bjarnason
2007-Jan-04 16:44 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
Hi,
I'm using backgroundrb 0.2.1 in a production environment and for the most
part I'm very happy. We are using it to do some heavy video editing on
the server side and it works great, except that under what seems to be
heavy load the problem below happens intermittently.
The problem has only happened 5 times out of over 500 runs by our
backgroundrb worker.
This is the code in our worker:
    logger.debug("info : progress: #{progress}")
    progress_percent = progress * 100
    if progress_percent >= 100
      results[:progress] = 99.99
    else
      results[:progress] = progress_percent  # <-- line of the crash
    end
Here is the error:
    can't convert Float into Hash
And the beginning of the stack trace:
    /.../ContentStore/vendor/plugins/backgroundrb/server/lib/backgroundrb/results.rb:40:in `merge!'
    /.../ContentStore/vendor/plugins/backgroundrb/server/lib/backgroundrb/results.rb:40:in `[]='
    /.../ContentStore/lib/workers/content_worker.rb:40:in `execute_mp4box'
    /.../ContentStore/lib/workers/content_worker.rb:22:in `execute_mp4box'
    /.../ContentStore/lib/workers/content_worker.rb:21:in `execute_mp4box'
    /.../ContentStore/lib/workers/content_worker.rb:181:in `do_work'
    /.../ContentStore/vendor/plugins/backgroundrb/server/lib/backgroundrb/worker.rb:55:
...
The problem seems to happen only under heavy load where more than 1
worker process is active at the same time.
Any ideas or leads?
Thanks,
Robert Bjarnason
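For readers unfamiliar with the message itself: the trace above shows the TypeError being raised from `merge!`, and Ruby's Hash#merge! raises a TypeError of this kind whenever its argument is not a hash. The snippet below is only a minimal illustration of that general Ruby behaviour, using a made-up `stored` hash. It does not show why backgroundrb's results.rb ended up passing a non-hash value, and the exact wording of the message differs between Ruby versions.

```ruby
# Illustrative only: `stored` is a made-up hash, not backgroundrb's internal state.
stored = { :progress => 10.0 }

message = nil
begin
  # Hash#merge! needs a Hash (or an object responding to #to_hash).
  # A bare Float is neither, so this raises TypeError.
  stored.merge!(42.5)
rescue TypeError => e
  message = e.message
end

puts message

# A well-formed update wraps the value in a hash and succeeds.
stored.merge!(:progress => 42.5)
puts stored[:progress]
```

The point of the sketch is simply that the value on the right-hand side of `results[:progress] = ...` is fine on its own; the failure means something other than a key/value hash reached `merge!`.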
Bob Hutchison
2007-Jan-06 13:13 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
Hi,
This sounds a bit like the trouble on OS X where you'd get a Fixnum to String error. This was caused by a bug in the gcc 4.0 compiler. The solution on OS X is to recompile Ruby with the -O1 optimisation level or to switch to the gcc 3.x compiler. As I understand it, on Linux you have the additional option of installing a newer version of gcc 4.x and recompiling (I'm using 4.0.2 on one of my Linux boxes and have never seen the problem).
Cheers,
Bob
----
Bob Hutchison -- blogs at <http://www.recursive.ca/hutch/>
Recursive Design Inc. -- <http://www.recursive.ca/>
Raconteur -- <http://www.raconteur.info/>
xampl for Ruby -- <http://rubyforge.org/projects/xampl/>
Robert Bjarnason
2007-Jan-07 07:54 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
Hi Bob,
Thanks for the pointer. I agree this is probably not directly a backgroundrb problem. I built Ruby 1.8.4 on a Debian Linux box using gcc version 3.3.5.
Warm regards,
Robert Bjarnason
Mason Hale
2007-Jan-09 20:54 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
I'm getting a similar error. Here is a partial stack trace:
    20070108-20:17:45 (27597) can't convert Symbol into Hash - (TypeError)
    20070108-20:17:45 (27597) /opt/lss/demo/0.4-06/vendor/plugins/backgroundrb/server/lib/backgroundrb/results.rb:40:in `merge!'
    20070108-20:17:45 (27597) /opt/lss/demo/0.4-06/vendor/plugins/backgroundrb/server/lib/backgroundrb/results.rb:40:in `[]='
The process will run fine for hours or days, but then will stop with this error. When it happens it takes down the entire BackgrounDRb server. (I'm adding some begin/rescue blocks to hopefully prevent that.)
I've got Ruby 1.8.5p12 compiled with gcc 3.4.6 on RedHat Linux.
This also seems to occur with more than one process (running the same worker class) at the same time.
Any ideas?
Mason
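The begin/rescue idea mentioned above could look roughly like the following. This is a sketch under assumptions, not code from the thread: `safe_store` and `FlakyResults` are hypothetical names, and `FlakyResults` merely simulates a results object whose `[]=` sometimes raises TypeError. It contains the failure so one bad write does not abort the caller; it does nothing about the underlying cause.

```ruby
require 'logger'

# Hypothetical helper (not part of BackgrounDRb): attempt a results write and
# log a TypeError instead of letting it propagate to the caller.
def safe_store(results, key, value, logger)
  results[key] = value
  true
rescue TypeError => e
  logger.error("results write failed for #{key.inspect}: #{e.message}")
  false
end

# Stand-in for a results object that fails on every odd-numbered write.
class FlakyResults
  def initialize
    @data = {}
    @calls = 0
  end

  def []=(key, value)
    @calls += 1
    raise TypeError, "can't convert Symbol into Hash" if @calls.odd?
    @data[key] = value
  end

  def [](key)
    @data[key]
  end
end

logger = Logger.new($stderr)
results = FlakyResults.new
first  = safe_store(results, :counter, 1, logger)  # simulated failure, logged
second = safe_store(results, :counter, 2, logger)  # succeeds
```

Returning true or false lets the caller decide whether to retry or carry on without the progress update.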
Robert Bjarnason
2007-Jan-10 03:21 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
Hi Mason,
I'm still seeing the same problem with Float, but in my case the backgroundrb server keeps running fine, and as people can retry on the client side this is not a blocker for us at the moment.
A compiler bug now seems less likely, as you are using a different version of Ruby, a different GCC and a different Linux distribution.
Maybe the lead here is that in both our cases more than one backgroundrb process is running when the crash happens. That should make it easy to create a test case that replicates the problem. Unfortunately I've only had the pleasure of working with Ruby for a couple of months, so I don't think I have the expertise needed to debug this.
As I understand backgroundrb, there would be two completely isolated Ruby VMs running our code, so maybe this is a backgroundrb problem after all?
Warm regards,
Robert Bjarnason
skaar
2007-Jan-10 13:05 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
It might be that we have to introduce a mutex in the results worker where this happens. I'll try to get this reproduced sometime this weekend.
/skaar
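As a rough picture of what a mutex around the results store could mean, here is a minimal sketch in plain Ruby. It assumes the results object is shared by several threads inside one process; `SynchronizedResults` and its layout are hypothetical and are not BackgrounDRb's actual implementation. Every read and write goes through one Mutex, so concurrent writers cannot interleave inside `merge!`.

```ruby
require 'thread' # needed for Mutex on Ruby 1.8; harmless on current Ruby

# Hypothetical stand-in for a shared results store.
class SynchronizedResults
  def initialize
    @lock = Mutex.new
    # One inner hash per job key, created on first use.
    @store = Hash.new { |hash, job_key| hash[job_key] = {} }
  end

  # Merge a single key/value pair into the job's results under the lock.
  def write(job_key, key, value)
    @lock.synchronize { @store[job_key].merge!(key => value) }
  end

  # Read a single value under the same lock.
  def read(job_key, key)
    @lock.synchronize { @store[job_key][key] }
  end
end

results = SynchronizedResults.new

# Four concurrent writers, loosely mirroring four workers reporting progress.
threads = (0...4).map do |n|
  Thread.new do
    1_000.times { |i| results.write("job_#{n}", :counter, i) }
  end
end
threads.each(&:join)

4.times { |n| puts results.read("job_#{n}", :counter) }
```

Keeping the locked sections this small, and never calling back into the store from inside them, avoids one thread trying to take the same lock twice.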
Mason Hale
2007-Jan-10 19:39 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
I've done some more work on this and have created a test case that reliably
throws errors, although the errors themselves are not consistent.
About 1 out of every 4 times, I get the "can't convert Symbol to Hash" error
in server/lib/backgroundrb/results.rb:40 in 'merge!'.
I created the following worker class in
{RAILS_ROOT}/lib/workers/results_test_worker.rb:
    # This class repeatedly writes values to the results, to
    # test the results process
    class ResultsTestWorker < BackgrounDRb::Worker::RailsBase
      def do_work(args)
        logger.info "Started ResultsTestWorker"
        results[:started_at] = Time.now
        args ||= {}
        limit = args[:limit] || 10_000
        logger.info "Limit is #{limit}"
        limit.times do |i|
          results[:last_update] = Time.now
          results[:counter] = i
        end
        stop_time = Time.now
        logger.info "Stopped ResultsTestWorker at #{stop_time}"
        results[:stopped_at] = stop_time
        self.delete
      end
    end
    ResultsTestWorker.register
Then in {RAILS_ROOT}/test/unit/drb_results_test.rb I have:
    require File.dirname(__FILE__) + '/../test_helper'

    class DrbResultsTest < Test::Unit::TestCase
      def setup
        # start backgroundrb server
        `../../script/backgroundrb start`
        sleep 5 # give it time to startup
      end

      def teardown
        # stop backgroundrb server
        `../../script/backgroundrb stop`
      end

      def test_results
        limit = 10
        keys = []
        4.times do |i|
          job_key = "#{self.class.name}_#{i}"
          keys << job_key
          MiddleMan.new_worker(:class => :results_test_worker, :job_key => job_key, :args => {:limit => limit})
        end
        sleep 2 # wait for workers to finish
        keys.each_with_index do |k, i|
          assert_not_nil MiddleMan[k], "checking job_key #{k} on iteration #{i}"
          assert_not_nil MiddleMan[k].object, "checking object on iteration #{i}"
          assert_not_nil MiddleMan[k].object.results, "checking results on iteration #{i}"
          assert_equal(limit - 1, MiddleMan[k].object.results.to_hash[:counter],
                       "checking counter on iteration #{i}")
        end
      end
    end
This test does the following:
- Spawns 4 results_test_worker processes that each write several values to
  the ResultsWorker (in parallel). Increasing the limit value increases the
  odds of these processes trying to write results at the same time, but I've
  found that a limit of 10 works pretty well.
- Waits a couple of seconds for the workers to finish (is there a better way
  to determine whether the processes are all done?).
- Then tries to access the results for each job_key, specifically to
  ensure that the counter value is equal to limit - 1.
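On the question of a better way than a fixed sleep: one common alternative is to poll a completion condition with a deadline. The helper below is a generic sketch and `wait_until` is a hypothetical name. In the test above, the block might check whether each job's results contain :stopped_at, which the worker sets last, but that check is an assumption and is not shown here. A background thread stands in for the workers.

```ruby
# Hypothetical helper: call the block repeatedly until it returns a truthy
# value or `timeout` seconds have passed. Returns true on success, false on
# timeout.
def wait_until(timeout, interval = 0.1)
  deadline = Time.now + timeout
  loop do
    return true if yield
    return false if Time.now >= deadline
    sleep interval
  end
end

# Stand-in for "the workers have finished": a thread flips a flag after a delay.
done = false
worker = Thread.new do
  sleep 0.2
  done = true
end

finished = wait_until(5) { done }        # returns as soon as the flag is set
worker.join

timed_out = wait_until(0.2) { false }    # condition never holds, so this times out

puts finished
puts timed_out
```

Compared with `sleep 2`, this returns as soon as the condition holds and fails explicitly when it never does, which makes a slow run distinguishable from a broken one.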
NOTE: I've never gotten this test to complete successfully. In addition to
the "can't convert Symbol to Hash" error, I've seen the following:
- The [:counter] value is much lower than the expected value. If limit is
  10,000 this value might be 246 when 9,999 was expected.
- The job_key is not recognized: the call to MiddleMan[k] returns nil. When
  this occurs, I can usually see in backgroundrb.log that fewer than 4
  workers were actually created, by counting the number of
  "Started ResultsTestWorker" messages in the log.
- The job_key is resolved, but the call to MiddleMan[k].object.results
  returns nil.
- The call to MiddleMan.new_worker hangs and never returns.
I'm sharing this code so that others can try it out. It's a bit of a hack to
get some testing working (starting and stopping the BackgrounDRb server on
each test, having a test worker class in lib/workers, etc.), but it is
self-contained and replicates the real-world environment of my code running
in Rails. If you have suggestions for improving the testing approach I'm all
ears.
I'm also interested in feedback on the code itself. Maybe I'm not working
with the MiddleMan object correctly. I have to admit I'm still wrapping my
head around DRb.
Resolving this issue is critical to my project so I will continue trying to
track things down. I'll start by adding a mutex to the Results#[]= method.
Mason
skaar
2007-Jan-10 21:05 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
I do have a test case that is close to this, which at this point has shown the exception once on a 10K result assignment with 4 workers.
(Mason - could you add a ticket for this?)
* Mason Hale (masonhale at gmail.com) [070110 13:27]:
> keys.each_with_index do |k, i|
>   assert_not_nil MiddleMan[k], "checking job_key #{k} on iteration #{i}"
>   assert_not_nil MiddleMan[k].object, "checking object on iteration #{i}"
>   assert_not_nil MiddleMan[k].object.results, "checking results on iteration #{i}"
>   assert_equal(limit - 1, MiddleMan[k].object.results.to_hash[:counter], "checking counter on iteration #{i}")
> end
You should probably use MiddleMan.worker(k) here, which will benefit from the WorkerProxy (this is an inconsistency that I had overlooked), where you are redirected to the results worker after the worker itself has gone away. So:
    MiddleMan.worker(k).results
Also,
    MiddleMan.worker(k).results[:counter]
should work as well.
Another thing that worries me a little is that I see very different completion times: everything from 10 minutes to almost half an hour, and that is with 4 x 10k results.
/skaar
Mason Hale
2007-Jan-10 21:22 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
Thanks skaar. I'll add a ticket.
FYI: I'm running my tests on a MacBook, OS X 10.4 with Ruby 1.8.4.
Mason Hale
2007-Jan-12 17:57 UTC
[Backgroundrb-devel] Intermittent "can''t convert Float into Hash" and results.rb
Following up on this item: I found that if I remove every call that adds values to the results_worker hash, the errors described earlier in this thread go away.

I also found that in some cases the call to 'BackgrounDRb::Results::stored' was returning the symbol :backgroundrb_results instead of a hash, thus triggering the "can't convert Symbol into Hash" TypeError.

I made a pass at synchronizing the WorkerResults []= and [] methods via Mutex, but was unsuccessful. I suspect it may have something to do with Mutex not being reentrant.

See:
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/24470
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/172652

In any case, avoiding use of results avoids the issue. For the time being, I'm storing any process state information in a database instead.

Mason

On 1/10/07, skaar <skaar at waste.org> wrote:
> It might be that we have to introduce a mutex in the results worker
> where this happens. I'll try to get this reproduced sometime this
> weekend.
>
> /skaar
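The suspicion about Mutex not being reentrant can be checked in isolation. The snippet below is only a minimal sketch and does not use any BackgrounDRb code; the variable names `lock` and `nested_error` are made up for illustration. It assumes a current Ruby, where a thread that tries to take a Mutex it already holds gets a ThreadError. On the Ruby 1.8 series discussed in this thread the symptom may differ (possibly a hang rather than an exception), which is not verified here.

```ruby
require 'thread'  # harmless on current Ruby; older versions needed it for Mutex

lock = Mutex.new
nested_error = nil

lock.synchronize do
  begin
    # Second acquisition by the thread that already holds the lock.
    lock.synchronize { }
  rescue ThreadError => e
    # A plain Mutex refuses recursive locking by its owner.
    nested_error = e
  end
end

puts "nested synchronize failed: #{nested_error.class}" if nested_error
```

If a synchronized []= ends up calling a synchronized [] on the same object, it would hit exactly this nested acquisition, which is one plausible reason the Mutex attempt did not work.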
skaar
2007-Jan-12 20:29 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
I'm going to try to use synchronize from Monitor, which is supposed to be reentrant. I just need to have a closer look at how the MonitorMixin works.

/skaar
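The reentrant locking mentioned here can be sketched with MonitorMixin from Ruby's standard library. This is only an illustration under assumptions: `SafeResults` and its `increment` method are hypothetical names, not BackgrounDRb's actual results class, and whether this resolves the TypeError reported in this thread is not established. What it shows is that a thread already inside `synchronize` can enter it again, so []= may call [] safely.

```ruby
require 'monitor'

# Hypothetical stand-in for a results store; not BackgrounDRb's real class.
class SafeResults
  include MonitorMixin

  def initialize
    super()      # lets MonitorMixin set up its internal lock
    @data = {}
  end

  def [](key)
    synchronize { @data[key] }
  end

  def []=(key, value)
    synchronize { @data[key] = value }
  end

  # Read-modify-write under one lock; the nested [] and []= calls
  # re-enter the monitor already held by the current thread.
  def increment(key)
    synchronize { self[key] = (self[key] || 0) + 1 }
  end
end

results = SafeResults.new
workers = 4.times.map do
  Thread.new { 1000.times { results.increment(:count) } }
end
workers.each(&:join)
puts results[:count]
```

The design point is that the whole read-modify-write sits inside a single `synchronize`, so no other thread can replace the stored value between the read and the write.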