Daniel Azuma
2007-Jul-29 18:29 UTC
[Backgroundrb-devel] Server dying with perpetual "Connection reset by peer"
I'm at a loss to explain a very strange error I'm getting. It happens on our production system, where a backgroundrb server shares a host with a Rails app running on Mongrel. The first one or two, occasionally three, calls to a worker (direct MiddleMan calls from Rails, not scheduled) succeed, but after that the server fails to initialize new workers. The following exception appears in the server logs, and I have to restart the server to get it working again (for one or two requests before it dies again).

This appears to be the same issue that Peer Allan observed back in May. In that case the workers were triggered on a schedule, but in my case they are triggered via a remote MiddleMan.

Does anyone have any idea what could cause this? Sample stack trace below:

Connection reset by peer - (DRb::DRbConnError)
/usr/local/lib/ruby/1.8/drb/drb.rb:566:in `read'
/usr/local/lib/ruby/1.8/drb/drb.rb:566:in `load'
/usr/local/lib/ruby/1.8/drb/drb.rb:632:in `recv_reply'
/usr/local/lib/ruby/1.8/drb/drb.rb:921:in `recv_reply'
/usr/local/lib/ruby/1.8/drb/drb.rb:1195:in `send_message'
/usr/local/lib/ruby/1.8/drb/drb.rb:1086:in `method_missing'
/usr/local/lib/ruby/1.8/drb/drb.rb:1170:in `open'
/usr/local/lib/ruby/1.8/drb/drb.rb:1085:in `method_missing'
/usr/local/lib/ruby/1.8/drb/drb.rb:1103:in `with_friend'
/usr/local/lib/ruby/1.8/drb/drb.rb:1084:in `method_missing'
/usr/local/lib/ruby/1.8/drb/drb.rb:1072:in `respond_to?'
/usr/local/lib/ruby/gems/1.8/gems/slave-1.2.1/lib/slave.rb:454:in `initialize'
/path/to/my/app/vendor/plugins/backgroundrb/server/lib/backgroundrb/middleman.rb:210:in `new'
/path/to/my/app/vendor/plugins/backgroundrb/server/lib/backgroundrb/middleman.rb:210:in `new_worker'
/path/to/my/app/vendor/plugins/backgroundrb/server/lib/backgroundrb/thread_pool.rb:36:in `dispatch'
/path/to/my/app/vendor/plugins/backgroundrb/server/lib/backgroundrb/thread_pool.rb:22:in `initialize'
/path/to/my/app/vendor/plugins/backgroundrb/server/lib/backgroundrb/thread_pool.rb:22:in `new'
/path/to/my/app/vendor/plugins/backgroundrb/server/lib/backgroundrb/thread_pool.rb:22:in `dispatch'
/path/to/my/app/vendor/plugins/backgroundrb/server/lib/backgroundrb/middleman.rb:199:in `new_worker'
/usr/local/lib/ruby/1.8/drb/drb.rb:1555:in `__send__'
/usr/local/lib/ruby/1.8/drb/drb.rb:1555:in `perform_without_block'
/usr/local/lib/ruby/1.8/drb/drb.rb:1515:in `perform'
/usr/local/lib/ruby/1.8/drb/drb.rb:1589:in `main_loop'
/usr/local/lib/ruby/1.8/drb/drb.rb:1585:in `loop'
/usr/local/lib/ruby/1.8/drb/drb.rb:1585:in `main_loop'
/usr/local/lib/ruby/1.8/drb/drb.rb:1581:in `start'
/usr/local/lib/ruby/1.8/drb/drb.rb:1581:in `main_loop'
/usr/local/lib/ruby/1.8/drb/drb.rb:1430:in `run'
/usr/local/lib/ruby/1.8/drb/drb.rb:1427:in `start'
/usr/local/lib/ruby/1.8/drb/drb.rb:1427:in `run'
/usr/local/lib/ruby/1.8/drb/drb.rb:1347:in `initialize'
/usr/local/lib/ruby/1.8/drb/drb.rb:1627:in `new'
/usr/local/lib/ruby/1.8/drb/drb.rb:1627:in `start_service'
/path/to/my/app/vendor/plugins/backgroundrb/server/lib/backgroundrb_server.rb:315:in `run'
/usr/local/lib/ruby/gems/1.8/gems/daemons-1.0.7/lib/daemons/application.rb:194:in `call'
/usr/local/lib/ruby/gems/1.8/gems/daemons-1.0.7/lib/daemons/application.rb:194:in `start_proc'
/usr/local/lib/ruby/gems/1.8/gems/daemons-1.0.7/lib/daemons/daemonize.rb:192:in `call'
/usr/local/lib/ruby/gems/1.8/gems/daemons-1.0.7/lib/daemons/daemonize.rb:192:in `call_as_daemon'
/usr/local/lib/ruby/gems/1.8/gems/daemons-1.0.7/lib/daemons/application.rb:198:in `start_proc'
/usr/local/lib/ruby/gems/1.8/gems/daemons-1.0.7/lib/daemons/application.rb:234:in `start'
/usr/local/lib/ruby/gems/1.8/gems/daemons-1.0.7/lib/daemons/controller.rb:69:in `run'
/usr/local/lib/ruby/gems/1.8/gems/daemons-1.0.7/lib/daemons.rb:185:in `run_proc'
/usr/local/lib/ruby/gems/1.8/gems/daemons-1.0.7/lib/daemons/cmdline.rb:105:in `call'
/usr/local/lib/ruby/gems/1.8/gems/daemons-1.0.7/lib/daemons/cmdline.rb:105:in `catch_exceptions'
/usr/local/lib/ruby/gems/1.8/gems/daemons-1.0.7/lib/daemons.rb:184:in `run_proc'
/path/to/my/app/vendor/plugins/backgroundrb/server/lib/backgroundrb_server.rb:301:in `run'

Thanks,
Daniel Azuma
Mason Hale
2007-Jul-30 15:06 UTC
[Backgroundrb-devel] Server dying with perpetual "Connection reset by peer"
On 7/29/07, Daniel Azuma <dazuma at alumni.caltech.edu> wrote:
> This appears to be the same issue that Peer Allan observed back in
> May-- in that case the workers were triggered on a schedule, but in
> my case it is triggered via a remote MiddleMan.

By "triggered via a remote MiddleMan" do you mean that one backgroundrb worker is spawning additional workers? If so, then that could be the problem. I had a lot of "connection reset by peer" errors when trying to spawn workers from other workers.

If not, are you using Unix sockets? I've seen cases where the process names become very long, because the class name and job key are concatenated into the process name. If the name is too long, backgroundrb may not be able to locate a matching socket file to make the connection. If that might be the case, try shortening your class names and/or setting explicit, short job keys for your worker processes.
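To make the length concern concrete, here is a minimal sketch in plain Ruby. It assumes a hypothetical naming scheme (directory, class name, and job key joined into one socket file name); backgroundrb's real naming may differ, and the directory, class names, and job keys are invented for illustration. What is not an assumption is that Unix domain socket paths are limited by the size of sun_path: 108 bytes on Linux and 104 on macOS/BSD, including the trailing NUL.

```ruby
# Conservative limit that fits both Linux (108) and macOS/BSD (104)
# once the trailing NUL byte is accounted for.
MAX_SOCKET_PATH_BYTES = 103

# Hypothetical naming scheme, for illustration only. The real socket
# names used by backgroundrb may be built differently.
def candidate_socket_path(dir, class_name, job_key)
  File.join(dir, "backgroundrb_#{class_name}_#{job_key}.sock")
end

# True when the path would not fit into sun_path on common platforms.
def socket_path_too_long?(path)
  path.bytesize > MAX_SOCKET_PATH_BYTES
end

# A long class name combined with a long, descriptive job key.
long_key  = "import_job_for_customer_42_started_at_2007_07_30_15_06_00_with_retry_and_notification"
long_path = candidate_socket_path("/tmp/backgroundrb", "NightlyCustomerReportImportWorker", long_key)

# The same job with a short class name and an explicit short job key.
short_path = candidate_socket_path("/tmp/backgroundrb", "ImportWorker", "imp42")

[long_path, short_path].each do |path|
  status = socket_path_too_long?(path) ? "too long" : "ok"
  puts "#{path.bytesize} bytes: #{status}"
end
```

If a check like this flags your own names, shorter class names or explicit short job keys, as suggested above, are the simplest thing to try.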
Daniel Azuma
2007-Jul-30 21:26 UTC
[Backgroundrb-devel] Server dying with perpetual "Connection reset by peer"
On 30 Jul, 2007, at 08:06, Mason Hale wrote:
> On 7/29/07, Daniel Azuma <dazuma at alumni.caltech.edu> wrote:
>> This appears to be the same issue that Peer Allan observed back in
>> May-- in that case the workers were triggered on a schedule, but in
>> my case it is triggered via a remote MiddleMan.
>
> By "triggered via a remote MiddleMan" do you mean that one
> backgroundrb worker is spawning additional workers? If so, then
> that could be the problem. I had a lot of connection reset by peer
> errors when trying to spawn workers from other workers.

Sorry, I didn't give a very clear description. All I meant was that I'm not using the scheduler (which other people seem to be having trouble with). I'm simply using MiddleMan from Rails and spawning workers that way, not trying to spawn workers from other workers. I merely want to use backgroundrb to detach certain jobs from the HTTP request and run them asynchronously.

> If not, are you using Unix sockets? I've seen cases where the
> process names become very long, due to the class name and job key
> being concatenated into the process name. If the name is too long,
> then backgroundrb may not be able to locate a matching socket file
> to make the connection. If that might be the case, try shortening
> your class names and/or setting explicit, short jobkeys for your
> worker processes.

I've tried both the drbunix and druby protocols, if that's what you're asking, and have had the same result with both. (I assume the difference between them is that the former uses Unix sockets whereas the latter uses IP?) However, I'll look into the process names anyway.

Thanks.
Daniel
David Balatero
2007-Aug-08 18:09 UTC
[Backgroundrb-devel] Server dying with perpetual "Connection reset by peer"
I was able to solve this very problem yesterday by changing my host from "localhost" to "127.0.0.1" in my config/backgroundrb.yml file. Does that work for you? Some systems apparently have trouble resolving hostnames, although localhost seems like a no-brainer to resolve...

- David
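One way to check whether name resolution is a plausible culprit on a given host is to look at what "localhost" actually resolves to there. The sketch below uses only Ruby's standard socket library; the helper name resolved_addresses is made up for this example. That "localhost" may resolve to an IPv6 address such as ::1, or to something other than 127.0.0.1, is offered as a possible explanation only and is not confirmed anywhere in this thread.

```ruby
require 'socket'

# Return the unique IP addresses a host name resolves to, in the order
# the system resolver reports them. Returns an empty array when the
# name does not resolve at all.
def resolved_addresses(host)
  Addrinfo.getaddrinfo(host, nil, nil, :STREAM).map(&:ip_address).uniq
rescue SocketError
  []
end

# Compare the name against the literal address used as the workaround.
%w[localhost 127.0.0.1].each do |host|
  puts "#{host} -> #{resolved_addresses(host).inspect}"
end
```

If the list for "localhost" is empty, or its first entry is not 127.0.0.1, then the server and the client may be ending up on different addresses, and putting the literal address in the config, as described above, sidesteps the lookup entirely.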
Alessio Bernesco Làvore
2007-Sep-16 23:58 UTC
[Backgroundrb-devel] Using ActiveRecord with CachingSweep.
Hi everyone,

I need to sweep a fragment cache when a specific model is destroyed. Usually I use a cache_sweeper that observes the "on_destroy" event and then sweeps the cache. When the operation is started from inside a controller, by an action, everything works well. But when it is started from inside a worker (e.g. a worker destroying objects older than X hours), I'm not able to trigger the sweeping.

Can anyone explain why this is happening, and also advise me on how to sweep a cache fragment from inside a worker?

Thank you,
Alessio Bernesco Làvore
-----------------------
www.muteki.it
bernesco at muteki.it
338/5614781
-----------------------