Robert Bjarnason
2007-Jan-04 16:44 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
Hi,

I'm using backgroundrb 0.2.1 in a production environment and for the most part I'm very happy. We are using it to do some heavy video editing on the server side and it works great, except that under what seems to be heavy load the problem below happens intermittently.

The problem has only happened 5 times out of over 500 runs by our backgroundrb worker.

This is the code in our worker:

    logger.debug("info : progress: #{progress}")
    progress_percent = progress * 100
    if progress_percent >= 100
      results[:progress] = 99.99
    else
      results[:progress] = progress_percent   # <-- line of the crash
    end

Here is the error:

    can't convert Float into Hash

And the beginning of the stack trace:

    /.../ContentStore/vendor/plugins/backgroundrb/server/lib/backgroundrb/results.rb:40:in `merge!'
    /.../ContentStore/vendor/plugins/backgroundrb/server/lib/backgroundrb/results.rb:40:in `[]='
    /.../ContentStore/lib/workers/content_worker.rb:40:in `execute_mp4box'
    /.../ContentStore/lib/workers/content_worker.rb:22:in `execute_mp4box'
    /.../ContentStore/lib/workers/content_worker.rb:21:in `execute_mp4box'
    /.../ContentStore/lib/workers/content_worker.rb:181:in `do_work'
    /.../ContentStore/vendor/plugins/backgroundrb/server/lib/backgroundrb/worker.rb:55: ...

The problem seems to happen only under heavy load, when more than one worker process is active at the same time.

Any ideas or leads?

Thanks,
Robert Bjarnason
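For readers unfamiliar with this error: Hash#merge! raises a TypeError whenever its argument is not a Hash. The sketch below only reproduces that error class in isolation. It makes no claim about what line 40 of results.rb actually contains, and `stored` and `merge_error` are made-up names for illustration. On Ruby 1.8 the message reads "can't convert Float into Hash"; newer Rubies word it differently.

```ruby
# Hypothetical stand-in for a results store; not BackgrounDRb code.
stored = { :progress => 10.0 }

# Returns the exception raised when merging a non-Hash value, or nil
# when the merge succeeds.
def merge_error(hash, value)
  hash.merge!(value)
  nil
rescue TypeError => e
  e
end

# Merging a Float is rejected with a TypeError.
error = merge_error(stored, 99.99)
puts "#{error.class}: #{error.message}"

# Merging a real Hash succeeds and updates the store.
merge_error(stored, { :progress => 99.99 })
puts stored.inspect
```

So a trace ending in `merge!` with this message suggests that something other than a Hash reached the merge, whatever the cause turns out to be.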
Bob Hutchison
2007-Jan-06 13:13 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
Hi,

This sounds a bit like the trouble in OS X where you'd get a Fixnum to String error. This was caused by a bug in the gcc 4.0 compiler. The solution on OS X is to re-compile Ruby with the -O1 optimisation level (or switch to the gcc 3.x compiler). As I understand it, with linux you have the additional option of installing a newer version of gcc 4.x and recompiling (I'm using 4.0.2 on one of my linux boxes and have never seen the problem).

Cheers,
Bob

----
Bob Hutchison -- blogs at <http://www.recursive.ca/hutch/>
Recursive Design Inc. -- <http://www.recursive.ca/>
Raconteur -- <http://www.raconteur.info/>
xampl for Ruby -- <http://rubyforge.org/projects/xampl/>
Robert Bjarnason
2007-Jan-07 07:54 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
Hi Bob,

Thanks for the pointer. I agree this is probably not directly a backgroundrb problem. I built Ruby 1.8.4 on a Debian Linux box using gcc version 3.3.5.

Warm regards,
Robert Bjarnason
Mason Hale
2007-Jan-09 20:54 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
I'm getting a similar error. Here is a partial stack trace:

    20070108-20:17:45 (27597) can't convert Symbol into Hash - (TypeError)
    20070108-20:17:45 (27597) /opt/lss/demo/0.4-06/vendor/plugins/backgroundrb/server/lib/backgroundrb/results.rb:40:in `merge!'
    20070108-20:17:45 (27597) /opt/lss/demo/0.4-06/vendor/plugins/backgroundrb/server/lib/backgroundrb/results.rb:40:in `[]='

The process will run fine for hours or days, but then will stop with this error. When it happens it takes down the entire BackgrounDRb server. (I'm adding some begin/rescue blocks to hopefully prevent that.)

I've got Ruby 1.8.5p12 compiled with gcc 3.4.6 on RedHat Linux.

This also seems to occur with more than one process (running the same worker class) at the same time.

Any ideas?

Mason
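A minimal sketch of the begin/rescue idea mentioned above, assuming the TypeError surfaces in the worker at the point of the results write. `store_result` and `FlakyResults` are hypothetical names, not part of BackgrounDRb. This only keeps the caller running; the value that failed to store is lost and the underlying cause is untouched.

```ruby
require 'logger'

# Hypothetical helper: writes one value into a results-like object and
# reports failure instead of letting a TypeError propagate.
def store_result(results, key, value, logger)
  results[key] = value
  true
rescue TypeError => e
  logger.warn("could not store #{key.inspect}: #{e.message}")
  false
end

# Stand-in that fails the way the trace above does; not BackgrounDRb code.
class FlakyResults
  def []=(key, value)
    raise TypeError, "can't convert Symbol into Hash"
  end
end

logger = Logger.new($stdout)
store_result({}, :counter, 1, logger)               # plain Hash: succeeds
store_result(FlakyResults.new, :counter, 1, logger) # logs a warning, returns false
```

Returning true or false lets the worker decide whether to retry the write or carry on without it.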
Robert Bjarnason
2007-Jan-10 03:21 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
Hi Mason,

I'm still seeing the same problem with Float, but in my case the backgroundrb server keeps on running fine, and as people can retry on the client side this is not a blocker for us at the moment.

Compiler bugs are now less likely, as you are using a different version of Ruby, a different GCC and a different Linux distribution.

Maybe the lead here is the fact that in both our cases more than one backgroundrb process is running when the crash happens. This should make it easy to create a test case that replicates the problem. Unfortunately, I've only had the pleasure of working with Ruby for a couple of months, so I don't think I have the expertise needed to debug this problem.

As I understand backgroundrb, there would be two completely isolated Ruby VMs running our code, so maybe this is a backgroundrb problem after all?

Warm regards,
Robert Bjarnason
skaar
2007-Jan-10 13:05 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
It might be that we have to introduce a mutex in the results worker where this happens. I'll try to get this reproduced sometime this weekend.

/skaar
Mason Hale
2007-Jan-10 19:39 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
I've done some more work on this and have created a test case that reliably throws errors, although the errors themselves are not consistent. About 1 out of every 4 times, I get the "can't convert Symbol to Hash" error in server/lib/backgroundrb/results.rb:40 in `merge!'.

I created the following worker class in {RAILS_ROOT}/lib/workers/results_test_worker.rb:

    # This class repeatedly writes values to the results, to
    # test the results process
    class ResultsTestWorker < BackgrounDRb::Worker::RailsBase

      def do_work(args)
        logger.info "Started ResultsTestWorker"
        results[:started_at] = Time.now
        args ||= {}
        limit = args[:limit] || 10_000
        logger.info "Limit is #{limit}"
        limit.times do |i|
          results[:last_update] = Time.now
          results[:counter] = i
        end
        stop_time = Time.now
        logger.info "Stopped ResultsTestWorker at #{stop_time}"
        results[:stopped_at] = stop_time
        self.delete
      end

    end
    ResultsTestWorker.register

Then in {RAILS_ROOT}/test/unit/drb_results_test.rb I have:

    require File.dirname(__FILE__) + '/../test_helper'

    class DrbResultsTest < Test::Unit::TestCase

      def setup
        # start backgroundrb server
        `../../script/backgroundrb start`
        sleep 5 # give it time to start up
      end

      def teardown
        # stop backgroundrb server
        `../../script/backgroundrb stop`
      end

      def test_results
        limit = 10
        keys = []
        4.times do |i|
          job_key = "#{self.class.name}_#{i}"
          keys << job_key
          MiddleMan.new_worker(:class => :results_test_worker,
                               :job_key => job_key,
                               :args => {:limit => limit})
        end

        sleep 2 # wait for workers to finish

        keys.each_with_index do |k, i|
          assert_not_nil MiddleMan[k], "checking job_key #{k} on iteration #{i}"
          assert_not_nil MiddleMan[k].object, "checking object on iteration #{i}"
          assert_not_nil MiddleMan[k].object.results, "checking results on iteration #{i}"
          assert_equal(limit - 1, MiddleMan[k].object.results.to_hash[:counter],
                       "checking counter on iteration #{i}")
        end
      end

    end

This test does the following:

- Spawns 4 results_test_worker processes that each write several values to the ResultsWorker (in parallel). Increasing the limit value increases the odds of these processes trying to write results at the same time, but I've found that a limit of 10 works pretty well.
- Waits a couple of seconds for the workers to finish (is there a better way to determine if the processes are all done?).
- Then tries to access the results for each job_key, specifically to ensure that the counter value is equal to limit - 1.

NOTE: I've never gotten this test to complete successfully. In addition to the "can't convert Symbol to Hash" error, I've seen the following:

- The [:counter] value is much lower than the expected value. If limit is 10,000 this value might be 246 when 9,999 was expected.
- The job_key is not recognized: the call to MiddleMan[k] returns nil. When this occurs, I can usually see in the backgroundrb.log that fewer than 4 workers were actually created. I can see this by counting the number of "Started ResultsTestWorker" messages in the log.
- The job_key is resolved, but the call to MiddleMan[k].object.results returns nil.
- The call to MiddleMan.new_worker hangs and never returns.

I'm sharing this code so that others can try it out. It's a bit of a hack to get some testing working (starting and stopping the BackgrounDRb server on each test, having a test worker class in lib/workers, etc.), but it is self-contained, and it replicates the real-world environment of my code running in Rails.

If you have suggestions for improving the testing approach, I'm all ears. I'm also interested in feedback on the code itself. Maybe I'm not working with the MiddleMan object correctly. I have to admit I'm still wrapping my head around DRb.

Resolving this issue is critical to my project, so I will continue trying to track things down. I'll start by adding a mutex to the Results#[]= method.

Mason
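On the question of a better way to wait than a fixed sleep: one common approach is to poll a condition with a timeout. The sketch below is generic Ruby and does not touch BackgrounDRb; `wait_until` is a hypothetical helper, and the background thread merely stands in for a worker. In the test above the block could, for instance, check for the :stopped_at key that the worker writes last, assuming results can be read while the worker is still running.

```ruby
# Hypothetical helper: polls the block until it returns a truthy value
# or the timeout (in seconds) expires. Returns true on success and
# false on timeout.
def wait_until(timeout, interval = 0.1)
  deadline = Time.now + timeout
  loop do
    return true if yield
    return false if Time.now >= deadline
    sleep interval
  end
end

# Self-contained demonstration: a background thread stands in for a
# worker that finishes after a short delay.
done = false
worker = Thread.new { sleep 0.3; done = true }
finished = wait_until(5) { done }
worker.join
puts "finished: #{finished}"
```

Compared with `sleep 2`, this returns as soon as the condition holds and makes a timeout an explicit, assertable failure rather than a silent race.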
skaar
2007-Jan-10 21:05 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
I do have a test case that is close to this, which at this point has shown the exception once on a 10K result assignment with 4 workers.

(Mason - could you add a ticket for this?)

* Mason Hale (masonhale at gmail.com) [070110 13:27]:
> I've done some more work on this and have created a test case that
> reliably throws errors, although the errors themselves are not consistent.
> About 1 out of every 4 times, I get the "can't convert Symbol to Hash"
> error in server/lib/backgroundrb/results.rb:40 in `merge!'.
>
>     keys.each_with_index do |k, i|
>       assert_not_nil MiddleMan[k], "checking job_key #{k} on iteration #{i}"
>       assert_not_nil MiddleMan[k].object, "checking object on iteration #{i}"
>       assert_not_nil MiddleMan[k].object.results, "checking results on iteration #{i}"
>       assert_equal(limit - 1, MiddleMan[k].object.results.to_hash[:counter],
>                    "checking counter on iteration #{i}")
>     end
>
>   end
> end

You should probably use MiddleMan.worker(k) here, which benefits from the WorkerProxy: you are redirected to the results worker after the worker itself has gone away (this is an inconsistency that I had overlooked). So:

    MiddleMan.worker(k).results

Also,

    MiddleMan.worker(k).results[:counter]

should work as well.

Another thing that worries me a little bit is that I see very different completion times: everything from 10 minutes to almost half an hour, and that is with 4 x 10k results.

/skaar
Mason Hale
2007-Jan-10 21:22 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
Thanks skaar. I'll add a ticket.

FYI -- I'm running my tests on a MacBook, OS X 10.4 w/ Ruby 1.8.4.
Mason Hale
2007-Jan-12 17:57 UTC
[Backgroundrb-devel] Intermittent "can''t convert Float into Hash" and results.rb
Following up on this item: I found that if I remove every call that adds values to the results_worker hash, the errors described below go away.

I also found that in some cases the call to 'BackgrounDRb::Results::stored' was returning the symbol :backgroundrb_results instead of a hash, which triggers the "can't convert Symbol into Hash" TypeError.

I made a pass at synchronizing the WorkerResults []= and [] methods via Mutex, but was unsuccessful. I suspect it may have something to do with Mutex not being reentrant.

See:
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/24470
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/172652

In any case, avoiding use of results avoids the issue. For the time being, I'm storing any process state information in a database instead.

Mason

On 1/10/07, skaar <skaar at waste.org> wrote:
> It might be that we have to introduce a mutex in the results worker
> where this happens. I'll try to get this reproduced sometime this
> weekend.
>
> /skaar
>
> * Mason Hale (masonhale at gmail.com) [070109 14:42]:
> > I'm getting a similar error, here is a partial stack trace:
> >
> > 20070108-20:17:45 (27597) can't convert Symbol into Hash - (TypeError)
> > 20070108-20:17:45 (27597) /opt/lss/demo/0.4-06/vendor/plugins/backgroundrb/server/lib/backgroundrb/results.rb:40:in `merge!'
> > 20070108-20:17:45 (27597) /opt/lss/demo/0.4-06/vendor/plugins/backgroundrb/server/lib/backgroundrb/results.rb:40:in `[]='
> >
> > The process will run fine for hours or days, but then will stop with
> > this error. When it happens it takes down the entire BackgrounDRb
> > server. (I'm adding some begin/rescue blocks to hopefully prevent that.)
> >
> > I've got Ruby 1.8.5p12 compiled with gcc 3.4.6 on RedHat Linux.
> >
> > This also seems to occur with more than one process (running the same
> > worker class) at the same time.
> >
> > Any ideas?
> >
> > Mason
> >
> > On 1/7/07, Robert Bjarnason <robert.bjarnason at gmail.com> wrote:
> > > Hi Bob,
> > >
> > > Thanks for the pointer, I agree this is probably not a directly
> > > backgroundrb related problem. I built Ruby 1.8.4 on a Debian Linux
> > > box using gcc version 3.3.5.
> > >
> > > Warm regards,
> > > Robert Bjarnason
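The reentrancy suspicion in the message above can be illustrated with a small sketch. It is not the BackgrounDRb code: `MutexStore` and `try_nested_write` are hypothetical names invented here, and the assumption is only that a synchronized writer ends up calling a synchronized reader on the same lock. On Ruby 1.9 and later the second lock attempt from the same thread raises ThreadError; on the Ruby 1.8 interpreters discussed in this thread the same pattern may instead hang or be reported as a deadlock (not verified here).

```ruby
require 'thread'  # needed for Mutex on Ruby 1.8; a no-op on later versions

# Hypothetical stand-in for a results store, for illustration only.
class MutexStore
  def initialize
    @lock = Mutex.new
    @data = {}
  end

  # Synchronized reader.
  def [](key)
    @lock.synchronize { @data[key] }
  end

  # Synchronized writer that goes through the synchronized reader while
  # it already holds the lock, so the same thread asks for the lock twice.
  def []=(key, value)
    @lock.synchronize do
      previous = self[key]   # second lock request from the same thread
      @data[key] = value
      previous
    end
  end
end

# Attempts one write and reports what happened as a symbol.
def try_nested_write(store)
  store[:progress] = 42.0
  :stored
rescue ThreadError
  :recursive_lock_rejected
end

puts try_nested_write(MutexStore.new)
```

If the real []= and [] methods call each other in this way, wrapping both in one plain Mutex would be expected to fail regardless of load, which would be consistent with the unsuccessful attempt described above.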
skaar
2007-Jan-12 20:29 UTC
[Backgroundrb-devel] Intermittent "can't convert Float into Hash" and results.rb
I'm going to try to use synchronize from Monitor, which is supposed to be reentrant. I just need to have a closer look at how the MonitorMixin works.

/skaar
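A minimal sketch of the Monitor approach mentioned above, under stated assumptions: `MonitorStore`, `increment`, and `hammer` are hypothetical names used only for illustration and are not taken from BackgrounDRb. It relies on the standard library's MonitorMixin, whose `synchronize` may be re-entered by the thread that already holds the lock.

```ruby
require 'monitor'

# Hypothetical results store, for illustration only.
class MonitorStore
  include MonitorMixin

  def initialize
    super()        # lets MonitorMixin set up its internal lock
    @data = {}
  end

  # Synchronized reader.
  def [](key)
    synchronize { @data[key] }
  end

  # Synchronized writer that calls the synchronized reader; the nested
  # synchronize is allowed because the monitor is reentrant.
  def []=(key, value)
    synchronize do
      previous = self[key]
      @data[key] = value
      previous
    end
  end

  # Read-modify-write under a single lock so concurrent updates are not lost.
  def increment(key)
    synchronize { self[key] = (self[key] || 0) + 1 }
  end
end

# Runs `threads` threads that each increment :count `per_thread` times
# and returns the final counter value.
def hammer(store, threads, per_thread)
  workers = (1..threads).map do
    Thread.new { per_thread.times { store.increment(:count) } }
  end
  workers.each { |t| t.join }
  store[:count]
end

puts hammer(MonitorStore.new, 4, 250)
```

The point of the design is that the whole read-modify-write happens while one thread holds the monitor, so nested calls between [] and []= neither block nor interleave with other threads. Whether this resolves the error reported in this thread is not established here.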