I'm using 0.24 with Webrick (in the process of migrating to 0.25 / passenger).

Occasionally, the puppetmasterd becomes unavailable, and we see error messages along the lines of:

Could not call puppetmaster.getconfig: #<Errno::ECONNRESET: Connection reset by peer>

I believe the puppetmasterd does not completely die, so it is still in the process list.

I'm wondering what a good way to monitor this would be.

I see that I can telnet into port 8140; is there something simple I can send that would give me an indication that everything is okay or not?

Any suggestions on monitoring this would be appreciated.

Pete

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en
-~----------~----~----~----~------~----~------~--~---
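On the question of what to send to port 8140: the master speaks HTTPS on that port, so a plain telnet connect only shows that something is listening. One possible probe, sketched here under assumptions, is to attempt a TLS handshake with `openssl s_client` and treat a failed or timed-out handshake as a problem. The names `MASTER_HOST`, `MASTER_PORT`, `probe_master`, and `check_master` are made up for this example, the 10-second limit is arbitrary, and `timeout` is assumed to be available from coreutils. A completed handshake does not prove the master can compile a catalog, so this is a coarse check at best.

```shell
#!/bin/sh
# Hypothetical probe, not an official Puppet check.
# MASTER_HOST and MASTER_PORT are assumptions; 8140 is the port from the question.
MASTER_HOST="${MASTER_HOST:-localhost}"
MASTER_PORT="${MASTER_PORT:-8140}"

probe_master() {
    # Try a TLS handshake and discard all output.
    # timeout guards against a master that accepts the connection but never answers.
    timeout 10 openssl s_client -connect "$MASTER_HOST:$MASTER_PORT" \
        </dev/null >/dev/null 2>&1
}

check_master() {
    # Report in Nagios style: exit status 0 is OK, 2 is CRITICAL.
    if probe_master; then
        echo "OK: $MASTER_HOST:$MASTER_PORT completed a TLS handshake"
        return 0
    fi
    echo "CRITICAL: $MASTER_HOST:$MASTER_PORT did not complete a TLS handshake"
    return 2
}
```

Calling `check_master; exit $?` at the end of the script would make it usable from cron or as a monitoring plugin.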
On Tue, Sep 22, 2009 at 1:38 PM, Pete Emerson <pemerson@gmail.com> wrote:
> Occasionally, the puppetmasterd becomes unavailable [...]
> I'm wondering what a good way to monitor this would be.

Strangely enough, we had a similar situation here. Whenever Puppet (the client) would reload its own configs, it would die and not start again. So we wrote a nice little Nagios script that monitored whether puppet was running on each machine. From that, we could restart or send an email or anything we want, really.

Cheers,

Clint
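The Nagios script mentioned above was not posted, so the following is only a guess at what such a check might look like, not Clint's actual script. It reads a pidfile and reports whether that process still exists, using the usual Nagios plugin exit codes (0 for OK, 2 for CRITICAL). The function name `check_pidfile` is hypothetical, the pidfile location is an assumption that depends on packaging, and the `/proc` lookup makes it Linux-only.

```shell
#!/bin/sh
# Sketch of a Nagios-style liveness check based on a pidfile.
# Linux-only: relies on /proc/<pid> existing for live processes.

check_pidfile() {
    pidfile=$1
    if [ ! -r "$pidfile" ]; then
        echo "CRITICAL: cannot read $pidfile"
        return 2
    fi
    pid=$(cat "$pidfile")
    # A stale pidfile names a pid that no longer has a /proc entry.
    if [ -n "$pid" ] && [ -d "/proc/$pid" ]; then
        echo "OK: process $pid is running"
        return 0
    fi
    echo "CRITICAL: no running process for pid '$pid'"
    return 2
}
```

It could be run as `check_pidfile /var/run/puppet/puppetd.pid; exit $?`, where the path is an assumed location. Note that this only catches a process that has died; it would not notice a master that stays in the process list but stops answering.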
We had this issue while we were using webrick and ended up with the following in cron running every 15 minutes:

if [ `/usr/bin/puppetlast | grep -v puppetlast | sort -n -k 4 | head -n 1 | awk '{print $4}'` -ge 15 ]; then /etc/init.d/puppetmaster restart; fi

- Jeff

On 09/22/2009 02:38 PM, Pete Emerson wrote:
> Occasionally, the puppetmasterd becomes unavailable [...]
> I'm wondering what a good way to monitor this would be.
Sounds promising. Did puppetlast appear after 0.24.7-4? I don't see it on my box.

[root@admin]# rpm -qa | grep ^puppet
puppet-server-0.24.7-4.el5
puppet-0.24.7-4.el5
[root@admin]# rpm -ql puppet puppet-server | grep puppetlast
[root@admin]# find /usr -name "puppetlast"
[root@admin]#

Pete

On Tue, Sep 22, 2009 at 12:45 PM, Jeff Adams <jeff.adams@kw.com> wrote:
> We had this issue while we were using webrick and ended up with the
> following in cron running every 15 minutes: [...]
Ah. I grabbed puppetlast from http://reductivelabs.com/trac/puppet/attachment/ticket/1188/puppetlast

It doesn't work with 0.24.7-4:

# ./puppetlast
puppetlast error: undefined method `version' for #<Puppet::Node::Facts:0x2aaaac0b8958>

However, it's pretty clear (even to my untrained Ruby eyes) what this script does, so it looks like what I can do is simply check the modification time of the most recently updated YAML file like this:

stat `ls -tr /var/lib/puppet/yaml/facts/*.yaml | tail -n1`

and then check to make sure that's within reasonable limits.

Pete

On Tue, Sep 22, 2009 at 12:55 PM, Pete Emerson <pemerson@gmail.com> wrote:
> Sounds promising. Did puppetlast appear after 0.24.7-4? I don't see it
> on my box.
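The "within reasonable limits" step could be sketched roughly as follows. This is an illustration under assumptions rather than a tested production check: the function names `newest_age_minutes` and `check_facts_fresh` are invented here, the threshold is whatever the caller passes in, and `stat -c %Y` assumes GNU stat as found on Linux. It takes the newest `*.yaml` file in a directory, computes its age in whole minutes, and reports with Nagios-style exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# Sketch: alert when no facts file has been updated recently.
# Assumes GNU stat and file names without newlines (hostnames are fine).

newest_age_minutes() {
    dir=$1
    # ls -t sorts newest first; an empty result means no YAML files matched.
    newest=$(ls -t "$dir"/*.yaml 2>/dev/null | head -n 1)
    [ -n "$newest" ] || return 1
    now=$(date +%s)
    mtime=$(stat -c %Y "$newest")   # modification time in epoch seconds
    echo $(( (now - mtime) / 60 ))
}

check_facts_fresh() {
    dir=$1
    max=$2   # maximum acceptable age in minutes (caller's choice)
    age=$(newest_age_minutes "$dir") || {
        echo "UNKNOWN: no facts files in $dir"
        return 3
    }
    if [ "$age" -gt "$max" ]; then
        echo "CRITICAL: newest facts file is $age minutes old"
        return 2
    fi
    echo "OK: newest facts file is $age minutes old"
    return 0
}
```

For the path in the message above, a call might be `check_facts_fresh /var/lib/puppet/yaml/facts 30`, where 30 minutes is an arbitrary example threshold that should be chosen relative to how often clients check in.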
We've got over 150 hosts hitting the one puppetmaster, and based on what I've seen via searching, it seems like we're running into scalability issues with Webrick, and the recommendation is to switch to Mongrel or Passenger. Looks to me like Passenger is where the focus is, so I'm working on migrating to 0.25 and Passenger, with multiple master nodes for redundancy and scalability.

Pete

On Tue, Sep 22, 2009 at 12:41 PM, Clint Savage <herlo1@gmail.com> wrote:
> Strangely enough, we had a similar situation here. Whenever Puppet
> (the client) would reload its own configs, it would die and not start
> again.
Pete Emerson <pemerson@gmail.com> wrote:
> Looks to me like Passenger is where the focus
> is, so I'm working on migrating to 0.25 and Passenger, with multiple
> master nodes for redundancy and scalability.

FWIW, we were also seeing our 0.24 puppetmaster stop responding from time to time, requiring a restart. Since upgrading to 0.25.1rc1, reliability has improved a lot, and the load decreased too, while still using the puppetmaster with the included webrick.

Matthias