Meng Kuan
2008-Jan-16 08:27 UTC
[Backgroundrb-devel] mysterious crash of a particular worker
Hi,

I am using the latest checkout from backgroundrb release 1.0.1. I have a worker called status_checker that periodically (every minute) checks the status of certain hosts over the network. It works fine at the beginning but after a while the worker will mysteriously disappear and stop working. I have other workers running but they do not disappear like this worker does.

These are the logs generated from logger.info calls for this particular worker.

2008-01-16 15:55 StatusChecker: check method started
2008-01-16 15:55 StatusChecker: processing host head0
2008-01-16 15:55 StatusChecker: processing domain webconsole
2008-01-16 15:55 StatusChecker: processing host host1
2008-01-16 15:55 StatusChecker: error encountered for host host1: RuntimeError: Unable to ssh to 192.168.1.2 as root
2008-01-16 15:55 StatusChecker: check method completed
2008-01-16 15:56 StatusChecker: check method started
2008-01-16 15:56 StatusChecker: processing host head0
2008-01-16 15:56 StatusChecker: processing domain webconsole
2008-01-16 15:56 StatusChecker: processing host host1
2008-01-16 15:56 StatusChecker: error encountered for host host1: RuntimeError: Unable to ssh to 192.168.1.2 as root
2008-01-16 15:56 StatusChecker: check method completed
2008-01-16 15:57 StatusChecker: check method started
2008-01-16 15:57 StatusChecker: processing host head0
2008-01-16 15:57 StatusChecker: processing domain webconsole
2008-01-16 15:57 StatusChecker: processing host host1
2008-01-16 15:57 StatusChecker: error encountered for host host1: RuntimeError: Unable to ssh to 192.168.1.2 as root
2008-01-16 15:57 StatusChecker: check method completed
2008-01-16 15:58 StatusChecker: check method started
2008-01-16 15:58 StatusChecker: processing host head0
2008-01-16 15:58 StatusChecker: processing domain webconsole
2008-01-16 15:58 StatusChecker: processing host host1
2008-01-16 15:58 StatusChecker: error encountered for host host1: RuntimeError: Unable to ssh to 192.168.1.2 as root
2008-01-16 15:58 StatusChecker: check method completed
2008-01-16 15:59 StatusChecker: check method started
2008-01-16 15:59 StatusChecker: processing host head0
2008-01-16 15:59 StatusChecker: processing domain webconsole
2008-01-16 15:59 StatusChecker: processing host host1
2008-01-16 15:59 StatusChecker: error encountered for host host1: RuntimeError: Unable to ssh to 192.168.1.2 as root
2008-01-16 15:59 StatusChecker: check method completed
2008-01-16 16:00 StatusChecker: check method started
2008-01-16 16:00 StatusChecker: processing host head0
2008-01-16 16:00 StatusChecker: processing domain webconsole
2008-01-16 16:00 StatusChecker: processing host host1
2008-01-16 16:00 StatusChecker: error encountered for host host1: RuntimeError: Unable to ssh to 192.168.1.2 as root
2008-01-16 16:00 StatusChecker: check method completed
2008-01-16 16:01 StatusChecker: check method started
2008-01-16 16:01 StatusChecker: processing host head0
2008-01-16 16:01 StatusChecker: processing domain webconsole
2008-01-16 16:01 StatusChecker: processing host host1
2008-01-16 16:01 StatusChecker: error encountered for host host1: RuntimeError: Unable to ssh to 192.168.1.2 as root
2008-01-16 16:01 StatusChecker: check method completed
2008-01-16 16:02 StatusChecker: check method started
2008-01-16 16:02 StatusChecker: processing host head0
2008-01-16 16:02 StatusChecker: processing domain webconsole
2008-01-16 16:02 StatusChecker: processing host host1
2008-01-16 16:02 StatusChecker: error encountered for host host1: RuntimeError: Unable to ssh to 192.168.1.2 as root
2008-01-16 16:02 StatusChecker: check method completed
2008-01-16 16:03 StatusChecker: check method started
2008-01-16 16:03 StatusChecker: processing host head0
2008-01-16 16:03 StatusChecker: processing domain webconsole
2008-01-16 16:03 StatusChecker: processing host host1
2008-01-16 16:03 StatusChecker: error encountered for host host1: RuntimeError: Unable to ssh to 192.168.1.2 as root
2008-01-16 16:03 StatusChecker: check method completed
2008-01-16 16:04 StatusChecker: check method started
2008-01-16 16:04 StatusChecker: processing host head0
2008-01-16 16:04 StatusChecker: processing domain webconsole
2008-01-16 16:04 StatusChecker: processing host host1
2008-01-16 16:04 StatusChecker: error encountered for host host1: RuntimeError: Unable to ssh to 192.168.1.2 as root
2008-01-16 16:04 StatusChecker: check method completed
2008-01-16 16:05 StatusChecker: check method started
2008-01-16 16:05 StatusChecker: processing host head0
2008-01-16 16:05 StatusChecker: processing domain webconsole
2008-01-16 16:05 StatusChecker: processing host host1
2008-01-16 16:05 StatusChecker: error encountered for host host1: RuntimeError: Unable to ssh to 192.168.1.2 as root
2008-01-16 16:05 StatusChecker: check method completed
2008-01-16 16:06 StatusChecker: check method started
2008-01-16 16:06 StatusChecker: processing host head0
2008-01-16 16:06 StatusChecker: processing domain webconsole
2008-01-16 16:06 StatusChecker: processing host host1
2008-01-16 16:06 StatusChecker: error encountered for host host1: RuntimeError: Unable to ssh to 192.168.1.2 as root
2008-01-16 16:06 StatusChecker: check method completed
2008-01-16 16:07 StatusChecker: check method started
2008-01-16 16:07 StatusChecker: processing host head0
2008-01-16 16:07 StatusChecker: processing domain webconsole
2008-01-16 16:07 StatusChecker: processing host host1
2008-01-16 16:07 StatusChecker: error encountered for host host1: RuntimeError: Unable to ssh to 192.168.1.2 as root
2008-01-16 16:07 StatusChecker: check method completed
2008-01-16 16:08 StatusChecker: check method started
2008-01-16 16:08 StatusChecker: processing host head0
2008-01-16 16:08 StatusChecker: processing domain webconsole
2008-01-16 16:08 StatusChecker: processing host host1
2008-01-16 16:08 StatusChecker: error encountered for host host1: RuntimeError: Unable to ssh to 192.168.1.2 as root
2008-01-16 16:08 StatusChecker: check method completed
2008-01-16 16:09 StatusChecker: check method started
2008-01-16 16:09 StatusChecker: processing host head0
2008-01-16 16:09 StatusChecker: processing domain webconsole
2008-01-16 16:09 StatusChecker: processing host host1
2008-01-16 16:09 StatusChecker: error encountered for host host1: RuntimeError: Unable to ssh to 192.168.1.2 as root
2008-01-16 16:09 StatusChecker: check method completed
2008-01-16 16:10 StatusChecker: check method started
2008-01-16 16:10 StatusChecker: processing host head0
2008-01-16 16:10 StatusChecker: processing domain webconsole
2008-01-16 16:10 StatusChecker: processing host host1

After 16:10 the worker seems to have disappeared. I found the following message in backgroundrb_debug.log:

/home/webcon/apps/webconsole/releases/20080115103838/vendor/plugins/backgroundrb/server/meta_worker.rb:329:in `check_for_timer_events': undefined method `info' for nil:NilClass (NoMethodError)
        from /home/webcon/apps/webconsole/releases/20080115103838/vendor/plugins/backgroundrb/framework/core.rb:148:in `start_reactor'
        from /home/webcon/apps/webconsole/releases/20080115103838/vendor/plugins/backgroundrb/framework/core.rb:147:in `loop'
        from /home/webcon/apps/webconsole/releases/20080115103838/vendor/plugins/backgroundrb/framework/core.rb:147:in `start_reactor'
        from /home/webcon/apps/webconsole/releases/20080115103838/vendor/plugins/backgroundrb/framework/worker.rb:21:in `start_worker'
        from /home/webcon/apps/webconsole/releases/20080115103838/vendor/plugins/backgroundrb/framework/packet_master.rb:134:in `fork_and_load'
        from /home/webcon/apps/webconsole/releases/20080115103838/vendor/plugins/backgroundrb/framework/packet_master.rb:98:in `load_workers'
        from /home/webcon/apps/webconsole/releases/20080115103838/vendor/plugins/backgroundrb/framework/packet_master.rb:93:in `each'
        from /home/webcon/apps/webconsole/releases/20080115103838/vendor/plugins/backgroundrb/framework/packet_master.rb:93:in `load_workers'
        from /home/webcon/apps/webconsole/releases/20080115103838/vendor/plugins/backgroundrb/framework/packet_master.rb:19:in `run'
        from /home/webcon/apps/webconsole/releases/20080115103838/vendor/plugins/backgroundrb/server/master_worker.rb:163:in `initialize'
        from script/backgroundrb:41:in `new'
        from script/backgroundrb:41

The code for this worker can be found here:

svn.sxven.com/webconsole/trunk/lib/workers/status_checker.rb

Any pointers on how to debug this are much appreciated.

cheers,
mengkuan
Hi Meng,

On Jan 16, 2008 1:57 PM, Meng Kuan <mengkuan at gmail.com> wrote:
> I am using the latest checkout from backgroundrb release 1.0.1. I
> have a worker called status_checker that periodically (every minute)
> checks the status of certain hosts over the network. It works fine at
> the beginning but after a while the worker will mysteriously
> disappear and stop working. I have other workers running but they do
> not disappear like this worker does.
>
> [...]
>
> After 16:10 the worker seems to have disappeared. I found the
> following message in backgroundrb_debug.log:
>
> /home/webcon/apps/webconsole/releases/20080115103838/vendor/plugins/backgroundrb/server/meta_worker.rb:329:in `check_for_timer_events': undefined method `info' for nil:NilClass (NoMethodError)
>
> [...]
>
> Any pointers on how to debug this is much appreciated.

Can you sync your plugin with trunk and see if it's fixed? Looks like we had a bug in logger functionality.

--
Let them talk of their oriental summer climes of everlasting conservatories; give me the privilege of making my own summer with my own coals.
gnufied.org
Meng Kuan
2008-Jan-18 10:24 UTC
[Backgroundrb-devel] mysterious crash of a particular worker
On 16 Jan 2008, at 11:16 PM, hemant wrote:
> Can you sync your plugin with trunk and see if it's fixed? Looks like
> we had a bug in logger functionality.

Using trunk instead of release-1.0.1 seems to have fixed the problem. Ran the workers for 5 hours and no more mysterious disappearances so far.

Thanks, hemant, and great work on the plugin!
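
For readers who hit a similar trace, here is a minimal Ruby sketch of the failure mode the backtrace points at: `info` being called on a logger that is still nil inside a timer callback. This is not backgroundrb's code. The `TimerLoop` class and its `tick_unguarded` and `tick_guarded` methods are hypothetical names made up for illustration, and the fallback-logger guard is only one possible way to avoid the crash, not necessarily what the trunk fix does.

```ruby
require "logger"

# Hypothetical stand-in for a worker's timer loop; NOT backgroundrb code.
class TimerLoop
  attr_reader :logger

  # The logger defaults to nil to mimic a logger that was never set up.
  def initialize(logger = nil)
    @logger = logger
  end

  # Unguarded call: raises NoMethodError when @logger is nil,
  # which is the kind of error shown in the backtrace above.
  def tick_unguarded
    @logger.info("timer fired")
  end

  # Guarded call: lazily falls back to a stderr logger if none was set.
  def tick_guarded
    @logger ||= Logger.new($stderr)
    @logger.info("timer fired")
  end
end

# Returns the error message raised by the unguarded call, or nil if none.
def crash_message
  TimerLoop.new.tick_unguarded
  nil
rescue NoMethodError => e
  e.message
end
```

If an exception like this is raised inside a worker's event loop and nothing rescues it, the worker process exits, which would match the "disappearing worker" symptom described in this thread.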