Hi all,

For the last 2 weeks we have been having problems with puppetd randomly stopping. The only thing the log shows is:

<snip>
Apr 21 12:14:59 relay puppetd[1376]: Finished catalog run in 4.78 seconds
Apr 21 12:30:07 relay puppetd[1376]: Finished catalog run in 4.76 seconds
Apr 21 12:45:20 relay puppetd[1376]: Finished catalog run in 4.61 seconds
</snip>

There is no pattern to when the process stops, but it usually happens after somewhere between 15 minutes and 5 hours (and 'only' on about 50 nodes). Running strace on puppetd shows that it receives a SIGINT and then exits. I have tried disabling most modules (we do have a few we cannot disable), but the processes still stop.

We are running version 0.25.4-2 on both master and slave, with an haproxy frontend since we have 2 servers (but one is disabled while we search for this random error).

haproxy has been configured with high timeouts, since it can take up to 177 seconds to process a node:

<from haproxy.cfg>
contimeout 35000
clitimeout 350000
srvtimeout 350000
</>

The Apache configuration of Passenger has the following values:

<from apache.vhost>
PassengerPoolIdleTime 900
PassengerMaxPoolSize 30
PassengerUseGlobalQueue on
PassengerHighPerformance on
RackAutoDetect On
</>

I have asked on #puppet@freenode whether anyone had an idea how to track this down, since it is becoming more of a pain to start puppetd every 15 minutes. I haven't been able to trigger the timeout when running with --debug --trace.

Our suspicion is that it is a timeout problem, since it usually stops after a high catalog run time.

So ... any idea how to track this down? _any_ input is welcome.

/Kim

--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
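
As a side note on the haproxy values quoted in the message above, here is a small sketch of the timeout arithmetic. It assumes the three settings carry no unit suffix and are therefore read as milliseconds (haproxy's default unit), and it treats the 177 second figure as the slowest request to cover. The function and variable names are made up for illustration.

```python
# Values copied from the haproxy.cfg excerpt in the message above.
# Assumption: no unit suffix means milliseconds (haproxy's default unit).
HAPROXY_TIMEOUTS_MS = {
    "contimeout": 35000,
    "clitimeout": 350000,
    "srvtimeout": 350000,
}

# Slowest node processing time reported in the message, in seconds.
SLOWEST_NODE_SECONDS = 177


def timeouts_in_seconds(timeouts_ms):
    """Convert a {name: milliseconds} mapping to {name: seconds}."""
    return {name: ms / 1000 for name, ms in timeouts_ms.items()}


def headroom(timeouts_ms, run_seconds):
    """Seconds left after the slowest run, for the two inactivity timeouts.

    contimeout is left out because it only bounds how long haproxy waits
    for the TCP connection to the backend to be established.
    """
    seconds = timeouts_in_seconds(timeouts_ms)
    return {name: seconds[name] - run_seconds
            for name in ("clitimeout", "srvtimeout")}


for name, seconds in timeouts_in_seconds(HAPROXY_TIMEOUTS_MS).items():
    print(f"{name}: {seconds:.0f} s")
print(headroom(HAPROXY_TIMEOUTS_MS, SLOWEST_NODE_SECONDS))
```

Under those assumptions the client and server timeouts are 350 s each, which leaves roughly 173 s of headroom over the slowest node. That suggests the haproxy settings by themselves are not the limit being hit, though it says nothing about Passenger or about the agent's own HTTP timeout.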

> Our suspicion comes down to its a problem with a timeout since
> usually it stops after a high catalog run time.
>
> So ... Any idea on how to track this down ? _any_ input is welcome

No, except that I see the problem as well:

http://projects.reductivelabs.com/issues/2888
http://projects.reductivelabs.com/issues/2661

I assume it is related to timeout issues; however, as you said, those are very hard to track.

cheers
pete

More data is needed, I think.

Can you run puppetd --no-daemonize --debug in 'screen' or by piping the output somewhere? It may give you a better clue.

> No except, that I see the problem as well:
> http://projects.reductivelabs.com/issues/2888
> http://projects.reductivelabs.com/issues/2661

Well, I can see that those 2 problems actually log something; mine just gives nothing. I have been running puppetd in screen sessions, but so far nothing.

/Kim

On Apr 21, 2010, at 2:51 PM, Ken wrote:
> More data is needed I think.
>
> Can you run puppetd --no-daemonize --debug in 'screen' or by piping
> the output somewhere? It may give you a better clue.

Well, I partly got something:

info: Filebucket[/var/lib/puppet/clientbucket]: Adding /etc/puppet/puppet.conf(f855be601e533a86b2c86a1e48e40281)
info: //puppet::client/File[/etc/puppet/puppet.conf]: Filebucketed /etc/puppet/puppet.conf to main with sum f855be601e533a86b2c86a1e48e40281
debug: //puppet::client/File[/etc/puppet/puppet.conf]/checksum: Replacing /etc/puppet/puppet.conf checksum {md5}f855be601e533a86b2c86a1e48e40281 with {md5}caeae0319caee30f24bb280916242f29
notice: //puppet::client/File[/etc/puppet/puppet.conf]/content: content changed '{md5}f855be601e533a86b2c86a1e48e40281' to 'unknown checksum'
info: //puppet::client/File[/etc/puppet/puppet.conf]: Scheduling refresh of Service[puppet]
debug: Format pson not supported for Puppet::FileServing::Metadata; has not implemented method 'from_pson'
debug: Format s not supported for Puppet::FileServing::Metadata; has not implemented method 'from_s'
debug: Service[puppet](provider=debian): Executing 'ps -ef'
debug: Service[puppet](provider=debian): PID is 29733
notice: //puppet/Service[puppet]: Triggering 'refresh' from 1 dependencies
debug: Service[puppet](provider=debian): Executing 'ps -ef'
debug: Service[puppet](provider=debian): PID is 29733
debug: Service[puppet](provider=debian): Executing '/etc/init.d/puppet restart'
notice: Caught TERM; calling stop

So it sees an update to puppet.conf and runs a restart, and then it stops. I guess the hack with a 1 second delay is not enough.

/Kim

Never mind: someone actually made a change, so it restarted just as it should, which left my screen session killed.

/Kim

On Apr 21, 2010, at 8:43 PM, Kim Gert Nielsen wrote:
> debug: Service[puppet](provider=debian): Executing '/etc/init.d/puppet restart'
> notice: Caught TERM; calling stop

On Apr 21, 2010, at 11:43 AM, Kim Gert Nielsen wrote:
> debug: Service[puppet](provider=debian): Executing 'ps -ef'
> debug: Service[puppet](provider=debian): PID is 29733
> debug: Service[puppet](provider=debian): Executing '/etc/init.d/puppet restart'
> notice: Caught TERM; calling stop
>
> so it see an update to puppet.conf and runs a restart .. then it stops .. then I guess the hack with 1 sec delay is not enough

Is restarting puppet using itself supported? I had always assumed it wasn't.

On Apr 21, 2010, at 8:50 PM, Patrick wrote:
> Is restarting puppet using itself supported? I had always assumed it wasn't.

I got the example a long time ago from example42, and they just added a service for it. It has worked before, but if it's unsupported then it might be the problem I have :)

On Apr 21, 2010, at 5:51 AM, Ken wrote:
> More data is needed I think.
>
> Can you run puppetd --no-daemonize --debug in 'screen' or by piping
> the output somewhere? It may give you a better clue.

First, I think you're saying that the client is crashing or hanging. My advice would be to do this on most of your computers: use puppet to push out a cron job that will either fix puppet or run puppet. That is, either have cron run puppet directly, or have it run a command every hour to make sure puppet is running fine.

Then set up a few nodes you can watch for debugging. I'd do something like "puppetd --no-daemonize --verbose --debug --trace | tee /root/puppet.log". As Ken said, running it in screen would help if you get disconnected.
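
The "command to make sure puppet is running" idea could look roughly like the sketch below. It is only an illustration, not something posted in the thread: the script is hypothetical, it assumes a Debian-style init script at /etc/init.d/puppet (the path that appears in the debug log earlier in the thread), and it assumes Python 3.7 or later is available on the node. It only acts when called with a made-up --fix flag, so it can be tried without touching the service.

```python
#!/usr/bin/env python3
"""Cron watchdog sketch: start puppetd again if it is no longer running."""
import os
import subprocess
import sys

# Assumption: Debian-style init script, as seen in the debug log in this thread.
START_COMMAND = ["/etc/init.d/puppet", "start"]


def puppetd_running(ps_output):
    """Return True if `ps -ef` output lists a process whose command runs puppetd.

    `ps -ef` prints eight columns (UID PID PPID C STIME TTY TIME CMD); the
    last column is the full command line and may contain spaces.
    """
    for line in ps_output.splitlines():
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        # Match on the basename so "/usr/bin/ruby /usr/sbin/puppetd" counts,
        # but a path that merely contains the word does not.
        if any(os.path.basename(token) == "puppetd" for token in fields[7].split()):
            return True
    return False


def check_and_start():
    """Start puppetd if it is missing. Returns True if a start was issued."""
    ps_output = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    ).stdout
    if puppetd_running(ps_output):
        return False
    subprocess.run(START_COMMAND, check=True)
    return True


# Only act when explicitly asked to, e.g. from the cron entry.
if __name__ == "__main__" and "--fix" in sys.argv:
    check_and_start()
```

A crontab line such as "0 * * * * /usr/local/sbin/puppet_watchdog.py --fix" (path hypothetical) would run it hourly. Note that this only papers over the stops; the nodes kept aside for debugging are still needed to find the cause.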

On Apr 21, 2010, at 12:01 PM, Kim Gert Nielsen wrote:
> I got the example long time ago from example42 and they just added a service for it. It has worked before but if its unsupported then it might be the problem I have :)

I have no idea if it's supported. I just always assumed it was a bad idea.

puppet will re-parse its config file if it changes, so there is usually no need to restart the daemon.

That said, IMHO it's much better to run puppet via cron.

Ohad

On Apr 21, 2010, at 2:51 PM, Ken wrote:
> More data is needed I think.
>
> Can you run puppetd --no-daemonize --debug in 'screen' or by piping
> the output somewhere? It may give you a better clue.

All I get is this:

/usr/lib/ruby/1.8/net/protocol.rb:135:in `sysread'
/usr/lib/ruby/1.8/net/protocol.rb:135:in `rbuf_fill'
/usr/lib/ruby/1.8/timeout.rb:62:in `timeout'
/usr/lib/ruby/1.8/timeout.rb:93:in `timeout'
/usr/lib/ruby/1.8/net/protocol.rb:134:in `rbuf_fill'
/usr/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
/usr/lib/ruby/1.8/net/protocol.rb:126:in `readline'
/usr/lib/ruby/1.8/net/http.rb:2020:in `read_status_line'
/usr/lib/ruby/1.8/net/http.rb:2009:in `read_new'
/usr/lib/ruby/1.8/net/http.rb:1050:in `request'
/usr/lib/ruby/1.8/net/http.rb:1037:in `request'
/usr/lib/ruby/1.8/net/http.rb:543:in `start'
/usr/lib/ruby/1.8/net/http.rb:1035:in `request'
/usr/lib/ruby/1.8/net/http.rb:772:in `get'
/usr/lib/ruby/1.8/puppet/indirector/rest.rb:69:in `find'
/usr/lib/ruby/1.8/puppet/indirector/indirection.rb:198:in `find'
/usr/lib/ruby/1.8/puppet/indirector.rb:51:in `find'
/usr/lib/ruby/1.8/puppet/configurer.rb:94:in `retrieve_catalog'
/usr/lib/ruby/1.8/puppet/util.rb:418:in `thinmark'
/usr/lib/ruby/1.8/benchmark.rb:308:in `realtime'
/usr/lib/ruby/1.8/puppet/util.rb:417:in `thinmark'
/usr/lib/ruby/1.8/puppet/configurer.rb:93:in `retrieve_catalog'
/usr/lib/ruby/1.8/puppet/configurer.rb:145:in `run'
/usr/lib/ruby/1.8/puppet/agent.rb:53:in `run'
/usr/lib/ruby/1.8/puppet/agent/locker.rb:21:in `lock'
/usr/lib/ruby/1.8/puppet/agent.rb:53:in `run'
/usr/lib/ruby/1.8/sync.rb:230:in `synchronize'
/usr/lib/ruby/1.8/puppet/agent.rb:53:in `run'
/usr/lib/ruby/1.8/puppet/agent.rb:130:in `with_client'
/usr/lib/ruby/1.8/puppet/agent.rb:51:in `run'
/usr/lib/ruby/1.8/puppet/application/puppetd.rb:103:in `onetime'
/usr/lib/ruby/1.8/puppet/application.rb:226:in `send'
/usr/lib/ruby/1.8/puppet/application.rb:226:in `run_command'
/usr/lib/ruby/1.8/puppet/application.rb:217:in `run'
/usr/lib/ruby/1.8/puppet/application.rb:306:in `exit_on_fail'
/usr/lib/ruby/1.8/puppet/application.rb:217:in `run'
/usr/sbin/puppetd:159

Are we missing the top of that stack trace, Kim?

FYI, if you use Ctrl-A + [ you can scroll back in screen.

ken.